


What character encoding is used for Linux configuration files?






























A colleague was using Qt's built-in QTextStream class to rewrite the /etc/network/interfaces file on an Ubuntu system. Part of that code included a call to QTextStream's setCodec() method, where the codec was set to UTF-8. (see https://doc.qt.io/qt-5/qtextstream.html#setCodec if you're curious)



This got me wondering which encoding Linux configuration files are SUPPOSED to be written in. It seems like ISO-8859-1 would be the closest to what I'd consider "plain ASCII" text, and I would (perhaps naively) assume this to be correct, since most configuration files are plain English with no need for much more than the basic alphabet, numbers and a few punctuation marks.



But then I also wonder what someone from a non-English-speaking country would do if they wanted to put comments into such files using characters that aren't in ISO-8859-. Are they just plain "out of luck"?



There are obviously a lot of "standard" configuration files that you'd find on an Ubuntu/Linux system, e.g.




  • /etc/network/interfaces

  • /etc/ntp.conf

  • /etc/hostname

  • ...


Would anyone care to weigh in on what encoding is actually supported/expected in these sorts of files? And where is this actually documented? Is it enshrined in some sort of "Linux developers manifesto" as something writers of new Linux system services should be following, and if so, where would I find a definitive source of that information?







linux configuration locale















asked 14 hours ago by JasonA

















  • UTF-8 is as close to plain ASCII as ISO-8859-1 is, in that both contain ASCII as a subset. Both encodings produce identical results if you restrict the text to plain ASCII. ISO-8859-1 has the problem, as you point out yourself, that it is a much more restricted encoding. IMHO, the 8-bit ISO-8859 encodings are a thing of the past and should be phased out.

    – Johan Myréen
    12 hours ago











  • If a particular service, for example the NTP daemon, is only written with ASCII in mind when it reads /etc/ntp.conf, what is going to happen if someone embeds non-ASCII UTF-8 characters (e.g. in a comment)? Is it explicitly doing UTF-8-aware processing of the configuration file (by design), or is it just "dumb luck" that it works? That's what I'm trying to understand here. Obviously there are a lot of "moving pieces", so I can't just read all their source code to figure this out. That's why I was looking for some sort of "recipe" document that they are all following (hopefully!)

    – JasonA
    10 hours ago













  • If the program that reads the configuration file expects plain ASCII, then I would say the chance it chokes on ISO-8859 is just as big as it is with UTF-8. If the non-ASCII characters are in comments, the chance is probably quite small.

    – Johan Myréen
    9 hours ago
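One reason non-ASCII comments tend to work "by luck" is that a parser which only looks for ASCII delimiters never needs to know the encoding. In both UTF-8 and ISO-8859-1, non-ASCII characters are encoded using only bytes with values 0x80 and above, so they cannot be confused with `#`, space or newline. Below is a minimal Python sketch of that idea, assuming a hypothetical `key value` format with `#` comments. It is not the real syntax or code of ntpd or any other daemon; `parse_config` and the host name are made up for illustration.

```python
def parse_config(raw: bytes) -> dict:
    """Parse 'key value' lines from raw bytes, ignoring '#' comments.

    Hypothetical format for illustration only; no real daemon is modelled.
    """
    settings = {}
    for line in raw.split(b"\n"):
        # Cut the line at the first '#'. Non-ASCII characters are encoded
        # with bytes >= 0x80 in both UTF-8 and ISO-8859-1, so they can
        # never be mistaken for '#' (0x23), space (0x20) or newline (0x0A).
        line = line.split(b"#", 1)[0].strip()
        if not line:
            continue
        key, _, value = line.partition(b" ")
        settings[key] = value.strip()
    return settings


# The same text, written to disk in two different encodings.
text = "# Zeitserver für das Büro\nserver ntp.example.org\n"
utf8_result = parse_config(text.encode("utf-8"))
latin1_result = parse_config(text.encode("iso-8859-1"))

print(utf8_result)
print(utf8_result == latin1_result)
```

The parser never decodes anything, so the comment's encoding is irrelevant to it. A program that does decode its input (or that allows non-ASCII values, not just comments) would need to pick an encoding explicitly.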





























1 Answer














The general encoding can be set via the LANG environment variable, but by now nearly all Linux distros and tools have migrated to UTF-8. The main advantage of UTF-8 for configuration files is that any string using only ASCII characters is also valid UTF-8, byte for byte. So for most configuration files it doesn't really matter, since they only use those characters anyway.
















answered 13 hours ago by Metalfreak
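To illustrate the ASCII-compatibility point, here is a small Python sketch (not part of the answer above). It assumes two made-up sample lines, and `encodings_agree` is a hypothetical helper name. It checks whether a piece of text comes out as the same bytes in UTF-8 and ISO-8859-1.

```python
def encodings_agree(text: str) -> bool:
    """Return True if UTF-8 and ISO-8859-1 produce identical bytes for text."""
    try:
        return text.encode("utf-8") == text.encode("iso-8859-1")
    except UnicodeEncodeError:
        # The text contains characters ISO-8859-1 cannot represent at all.
        return False


# Illustrative sample lines, not taken from any real system.
config_line = "iface eth0 inet dhcp"   # pure ASCII
comment_line = "# Büro-Netzwerk"       # contains a non-ASCII 'ü'

print(encodings_agree(config_line))    # True: ASCII is a subset of both
print(encodings_agree(comment_line))   # False: 'ü' is encoded differently
print(comment_line.encode("utf-8"))
print(comment_line.encode("iso-8859-1"))
```

As long as a file sticks to ASCII, the choice between the two encodings makes no difference to the bytes on disk. The choice only starts to matter once non-ASCII characters appear.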
