Several questions about file-system character encoding on Linux

Because I exchange a lot of files between Windows (GBK encoding) and Linux (UTF-8 encoding), I easily run into character encoding issues, for example:

  • Creating zip/tar archives on Windows whose file names contain Chinese characters, then extracting them on Linux.

  • Running a migrated legacy Java web application (designed on Windows, using GBK encoding in JSP) that writes files with GBK-encoded names to disk.

  • Using FTP get/put to transfer files with GBK-encoded names between a Windows FTP server and a Linux client.

  • Switching the LANG environment variable on Linux.


What these cases have in common is trouble locating and naming files. After some searching, I found the article Using Unicode in Linux (http://www.linux.com/archive/feed/39912), which says:




the operating system and many utilities do not realize what characters the bytes in file names represent.




So it is possible to have two 中文.txt files whose names are stored in different encodings:



[root@fedora test]# ls
???? 中文
[root@fedora test]# ls | iconv -f GBK
中文
涓iconv: illegal input sequence at position 7
[root@fedora test]# ls 中文 && ls $'\xd6\xd0\xce\xc4'|iconv -f gbk
中文
中文
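To make the byte-level difference concrete, here is a minimal Python sketch (Python is only my choice for illustration, it is not used in the article). It encodes the same two characters both ways and checks whether the GBK bytes are valid UTF-8, which is why a UTF-8 terminal shows ???? for that entry.

```python
# Illustration only: the same visible name, 中文, as two different byte strings.
name = "中文"

utf8_bytes = name.encode("utf-8")  # what a zh_CN.UTF-8 program hands to the kernel
gbk_bytes = name.encode("gbk")     # what a zh_CN.GBK program hands to the kernel

print(utf8_bytes)  # b'\xe4\xb8\xad\xe6\x96\x87'
print(gbk_bytes)   # b'\xd6\xd0\xce\xc4'

# The GBK bytes are not valid UTF-8, so a UTF-8 terminal cannot render them.
try:
    gbk_bytes.decode("utf-8")
    gbk_is_valid_utf8 = True
except UnicodeDecodeError:
    gbk_is_valid_utf8 = False

print(gbk_is_valid_utf8)  # False
```

Both byte strings are distinct, so both can exist as separate entries in one directory.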


Questions:




  1. Is it possible to configure a Linux filesystem to use a fixed character encoding for file names (the way NTFS uses UTF-16 internally), regardless of the LANG/LC_ALL environment?

  2. Or, what I actually want to ask: is it possible to make the file name 中文.txt ($'\xe4\xb8\xad\xe6\x96\x87.txt') in a zh_CN.UTF-8 environment and the file name 中文.txt ($'\xd6\xd0\xce\xc4.txt') in a zh_CN.GBK environment refer to the same file?

  3. If this is not configurable, is it possible to patch the kernel to translate character encodings between the filesystem and the current environment (just a question, not a request for an implementation)? And how much would that affect performance?







linux filesystems filenames character-encoding














asked Jun 22 '11 at 10:09 by LiuYan 刘研 (edited by Todd Sewell)













  • You could tackle the problem from the Windows side by using Cygwin 1.7, which does automatically translate between the filesystem's UTF-16 encoding and whatever encoding has been specified in the locale settings. It defaults to UTF-8, so for example Cygwin tar would encode filenames as UTF-8.

    – ak2
    Jun 24 '11 at 8:13











  • @ak2 Thanks. Cygwin is really good; I've been using it for years. The tar/zip case is just an example: in a real environment, the zip/tar files may be created by others (for example, downloaded from the internet).

    – LiuYan 刘研
    Jun 24 '11 at 8:34



















2 Answers
I have reformulated your questions a bit, for reasons that should
become evident when you read them in sequence.



1. Is it possible to configure a Linux filesystem to use a fixed character encoding for file names, regardless of the LANG/LC_ALL environment?



No, this is not possible: as you mention in your question, a UNIX file
name is just a sequence of bytes; the kernel knows nothing about
the encoding, which is entirely a user-space (i.e., application-level)
concept.



In other words, the kernel knows nothing about LANG/LC_*, so it cannot
translate.
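A small Python sketch of that point, assuming a POSIX system. Python's os.fsdecode and os.fsencode convert between the raw bytes the kernel deals in and text, and the conversion happens entirely inside the process. The name below is the GBK one from the question; nothing is read from or written to disk.

```python
import os

# GBK bytes for 中文.txt, taken from the question.
raw_name = b"\xd6\xd0\xce\xc4.txt"

# On POSIX, os.fsdecode uses the surrogateescape error handler, so bytes that
# do not fit the process's filesystem encoding are carried along, not lost.
as_text = os.fsdecode(raw_name)
round_trip = os.fsencode(as_text)

# Only a program that is told "these bytes are GBK" can recover the characters.
gbk_view = raw_name.decode("gbk")

print(round_trip == raw_name)  # True
print(gbk_view)                # 中文.txt
```

The interpretation lives in the decode call, not in the file name itself.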



2. Is it possible to let different file names refer to the same file?



You can have multiple directory entries referring to the same file;
you can create them with hard links or symbolic links.



Be aware, however, that the file names that are not valid in the
current encoding (e.g., your GBK character string when you're working
in a UTF-8 locale) will display badly, if at all.
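As a rough sketch of the hard-link idea in Python: the helper name link_both_encodings is made up for this example, and I am assuming a Linux filesystem that accepts arbitrary bytes in names (ext4 or tmpfs, for instance). It creates the file under its UTF-8 name in a temporary directory and hard-links it under the GBK name.

```python
import os
import tempfile


def link_both_encodings(directory: bytes, name: str):
    """Create `name` UTF-8 encoded, then hard-link it under its GBK encoding."""
    utf8_path = os.path.join(directory, name.encode("utf-8"))
    gbk_path = os.path.join(directory, name.encode("gbk"))
    with open(utf8_path, "wb") as handle:
        handle.write(b"same content\n")
    os.link(utf8_path, gbk_path)  # second directory entry, same inode
    return utf8_path, gbk_path


with tempfile.TemporaryDirectory() as tmp:
    directory = os.fsencode(tmp)
    utf8_path, gbk_path = link_both_encodings(directory, "中文.txt")
    same = os.path.samefile(utf8_path, gbk_path)
    entries = sorted(os.listdir(directory))  # bytes in, bytes out

print(same)     # True
print(entries)  # [b'\xd6\xd0\xce\xc4.txt', b'\xe4\xb8\xad\xe6\x96\x87.txt']
```

This has to be repeated for every file, and new files created later get only one of the two names, so it is a workaround and not a translation layer.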



3. Is it possible to patch the kernel to translate character encodings between the filesystem and the current environment?



You cannot patch the kernel to do this (see 1.), but you could, in
theory, patch the C library (e.g., glibc) to perform this translation:
always convert file names to UTF-8 when it calls the kernel, and
convert them back to the current encoding when it reads a file name
from the kernel.



A simpler approach could be to write an overlay filesystem with FUSE,
that just redirects any filesystem request to another location after
converting the file name to/from UTF-8. Ideally you could mount this
filesystem in ~/trans, and when an access is made to
~/trans/a/GBK/encoded/path then the FUSE filesystem really accesses
/a/UTF-8/encoded/path.
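The name translation such an overlay would perform can be sketched in a few lines of Python. This is only the path-conversion step, not a FUSE filesystem; translate_path is a hypothetical helper, and it assumes every path component really is valid GBK.

```python
def translate_path(path: bytes, source: str = "gbk", target: str = "utf-8") -> bytes:
    """Re-encode each component of `path` from `source` to `target`."""
    # Splitting on b"/" is safe for GBK: 0x2F never occurs as the second
    # byte of a two-byte GBK character.
    components = path.split(b"/")
    return b"/".join(part.decode(source).encode(target) for part in components)


requested = b"/a/\xd6\xd0\xce\xc4.txt"  # what a GBK program asks for
actual = translate_path(requested)      # what the overlay would really open

print(actual)  # b'/a/\xe4\xb8\xad\xe6\x96\x87.txt'
```

For names read back from the underlying directory, the same function would be called with source and target swapped.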



However, the problem with these approaches is: what do you do with
files that already exist on your filesystem and are not UTF-8 encoded?
You cannot just simply pass them untranslated, because then you don't
know how to convert them; you cannot mangle them by translating
invalid character sequences to ? because that could create
conflicts...
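The conflict is easy to reproduce. In this Python sketch (mangle is a hypothetical helper), two different GBK-encoded names are decoded as UTF-8 with every invalid sequence shown as ?, and both end up with the same mangled name.

```python
def mangle(raw_name: bytes) -> str:
    """Decode as UTF-8, showing every undecodable byte sequence as '?'."""
    return raw_name.decode("utf-8", errors="replace").replace("\ufffd", "?")


first = "中".encode("gbk")   # b'\xd6\xd0'
second = "文".encode("gbk")  # b'\xce\xc4'

# Two distinct names on disk, one name after mangling.
collide = first != second and mangle(first) == mangle(second)

print(collide)  # True
```

Once two names collapse like this, the translation layer can no longer tell which file a request for the mangled name refers to.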















answered Jun 22 '11 at 12:03 by Riccardo Murri (edited May 22 '12 at 22:53 by Gilles)








  • Such an overlay filesystem exists: Convmvfs.

    – Gilles
    Jun 22 '11 at 23:52














What you can do is limit the set of supported locales to UTF-8 locales only.



http://www.fifi.org/cgi-bin/man2html/usr/share/man/man5/locale.gen.5
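As a loose illustration, assuming the Debian-style locale.gen format of one "locale charset" pair per line: the Python sketch below filters such lines down to the UTF-8 ones. keep_utf8_only is a hypothetical helper, and the sample lines are made up; it works on a list of strings and does not touch the real file.

```python
def keep_utf8_only(lines):
    """Keep comments, blank lines, and entries whose charset field is UTF-8."""
    kept = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            kept.append(line)  # blank line or comment: leave untouched
        elif fields[-1].upper() == "UTF-8":
            kept.append(line)  # charset is the last field on the line
    return kept


sample = [
    "# sample entries",
    "en_US.UTF-8 UTF-8",
    "zh_CN.GBK GBK",
    "zh_CN.UTF-8 UTF-8",
    "zh_CN GB2312",
]
result = keep_utf8_only(sample)

print(result)  # ['# sample entries', 'en_US.UTF-8 UTF-8', 'zh_CN.UTF-8 UTF-8']
```

Restricting the generated locales only stops local programs from running under a GBK locale. Names that arrive already GBK-encoded from archives or FTP are not changed by it.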

















answered Jun 22 '11 at 12:07 by Let_Me_Be








  • Personally, I wish there were only one character encoding (UTF-8) in the world, but legacy applications are still running and interoperability between Windows and Linux must be achieved, so most people have to face this nightmare.

    – LiuYan 刘研
    Jun 22 '11 at 18:38













