Is there an alternative to sed that supports unicode?



31















For example:

sed 's/\u0091//g' file1

Right now, I have to run hexdump to get the hex bytes and put them into sed as follows:

$ echo -ne '\u9991' | hexdump -C
00000000 e9 a6 91 |...|
00000003

And then:

$ sed 's/\xe9\xa6\x91//g' file1









      Tags: sed, unicode, hexdump

      asked Apr 17 '15 at 8:38 by A-letubby; edited Apr 17 '15 at 18:03 by chaos

























          6 Answers
          26
















          Just put the character in the pattern directly:

          sed 's/馑//g' file1

          Or in the escaped form:

          sed "s/$(echo -ne '\u9991')//g" file1

          (Note that older versions of Bash and some other shells do not understand echo -e '\u9991', so check first.)
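If the shell in use cannot expand \u escapes, one possible workaround is to spell out the character's UTF-8 bytes with printf, whose octal escapes are specified by POSIX. This is only a minimal sketch, assuming the input is UTF-8; `strip_jin` is a hypothetical helper name, not something from the answer above.

```shell
# Hypothetical helper: delete every U+9991 (馑) from standard input.
# The character is written as its three UTF-8 bytes e9 a6 91 in octal,
# so the shell does not need to understand \u escapes.
strip_jin() {
    jin=$(printf '\351\246\221')
    sed "s/$jin//g"
}

# Example call on a short UTF-8 string.
printf '饥馑荐臻\n' | strip_jin
```

The same function can be applied to a file with `strip_jin < file1`.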






          • 1





            Does sed count 馑 as one character or 3? That is, does echo 馑 | sed s/...// print anything?

            – immibis
            Apr 17 '15 at 11:22













          • @immibis Since sed has the g modifier, it replaces all occurrences, even when they follow each other. sed should also count it as one character: echo -ne "馑" | wc -m gives 1, whereas counting bytes (wc -c) returns 3. Did I understand your question correctly?

            – chaos
            Apr 17 '15 at 11:28













          • I meant: does . mean "one character" or "one byte"?

            – immibis
            Apr 17 '15 at 11:30











          • @immibis It matches one character, hence echo 馑 | sed s/...// gives me 馑 (nothing is replaced)

            – chaos
            Apr 17 '15 at 11:33








          • 4





            @chaos: It works under en_US.UTF-8, but doesn't under C.

            – choroba
            Apr 17 '15 at 12:28
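The locale dependence mentioned in the last comment can be observed by forcing LC_ALL=C, where sed does no UTF-8 decoding and `.` falls back to single bytes. The following is a rough sketch only; `count_dots_c` is a hypothetical name, and the behaviour is assumed to match a typical GNU sed rather than verified on every implementation.

```shell
# Hypothetical helper: count how many times "." matches in $1 when
# sed runs in the C locale (byte-oriented, no UTF-8 decoding).
count_dots_c() {
    printf '%s\n' "$1" |
        LC_ALL=C sed 's/./x/g' |   # replace each "." match with x
        tr -d '\n' |               # drop the line terminator
        wc -c |                    # count the x's
        tr -d ' '                  # some wc versions pad with spaces
}

# The argument is 馑 given as its raw UTF-8 bytes (e9 a6 91).
count_dots_c "$(printf '\351\246\221')"
```

Under a UTF-8 locale the same substitution would be expected to see one character instead, which is the difference the comments above are describing.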



















          15
















          Perl can do that:



          echo 汉典“馑”字的基本解释 | perl -CS -pe 's/\N{U+9991}/Jin/g'


          -CS turns on UTF-8 for standard input, output and error.






          • 7





            Perl can do almost anything.....

            – wobbily_col
            Apr 17 '15 at 10:49



















          6
















          A number of versions of sed support Unicode:





          • Heirloom sed, which is based on "original Unix material".


          • GNU sed, which is its own codebase.


          • Plan 9 sed, which has been ported to Unix-like operating systems.


          I couldn't find information on BSD sed, which I thought was strange, but I think the odds are good that it supports Unicode too. Unfortunately, there is no standard way to tell sed which encoding to use, so each one does this in its own way.






          • Do they support UTF-16 with and without BOM?

            – Bon Ami
            Apr 17 '15 at 17:12






          • 10





            UTF-16 is pretty unusable in Unix-based OSes. It's also an abomination that should have never seen the light of day.

            – Brian Bi
            Apr 17 '15 at 19:11











          • Whether or not they support UTF-16 depends on the implementation, and I'm afraid I don't have that data. I doubt that Plan 9 sed does (the original OS is UTF-8 everywhere), but I can't be sure, and even if it doesn't, the others might.

            – The Spooniest
            Apr 17 '15 at 19:30



















          2
















          This works for me:



          $ vim -nEs +'%s/\%u9991//g' +wq file1

          It’s a drop more verbose than I’d like; here’s a full explanation:

          • -n disable vim swap file
          • -E Ex improved mode
          • -s silent mode
          • +'%s/\%u9991//g' execute the substitution command
          • +wq save and exit






          • I suppose this modifies file1 in-place, is that correct?

            – gerrit
            Jan 10 at 10:32











          • @gerrit that’s correct, and thanks for pointing it out.

            – Aryeh Leib Taurog
            Jan 10 at 19:21



















          0
















          Works for me with GNU sed (version 4.2.1):

          $ echo -ne $'\u9991' | sed 's/\xe9\xa6\x91//g' | hexdump -C
          $ echo -ne $'\u9991' | hexdump -C
          00000000 e9 a6 91

          (As another replacement for sed you could also use GNU awk, but it doesn't seem necessary.)
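A sketch of the byte-escape form in isolation. It assumes GNU sed, since \xHH inside a regular expression is a GNU extension, and it forces LC_ALL=C so that pattern and input are both treated as raw bytes (that setting is an assumption added here, not part of the answer). `strip_bytes` is a hypothetical name.

```shell
# Hypothetical helper: remove the byte sequence e9 a6 91 (U+9991 in
# UTF-8) from standard input. Assumes GNU sed for the \xHH escapes.
strip_bytes() {
    LC_ALL=C sed 's/\xe9\xa6\x91//g'
}

# Dump the surviving bytes in hex; od is used here because it is
# specified by POSIX, unlike hexdump.
printf 'a\351\246\221b\n' | strip_bytes | od -An -tx1
```

With a sed that lacks \xHH support, building the pattern with printf and passing it in through a shell variable is a more portable alternative.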






            0
















            With recent versions of Bash, you can leave the sed expression unquoted and use Bash's $'...' escaped strings. Spaces within the sed expression, or parts of it that Bash might interpret as wildcards, can be quoted individually.

            $ echo "饥馑荐臻" | sed s/$'\u9991'//g
            饥荐臻





              Answer with score 26 (sed with the literal character): answered Apr 17 '15 at 8:46 by chaos; edited Oct 17 '16 at 16:52 by Flimm











              Answer with score 15 (Perl): answered Apr 17 '15 at 8:50 by choroba











              Answer with score 6 (sed implementations): answered Apr 17 '15 at 12:54 by The Spooniest
















              Answer with score 2 (vim): answered Apr 17 '18 at 18:21 by Aryeh Leib Taurog
















              • I suppose this modifies file1 in-place, is that correct?

                – gerrit
                Jan 10 at 10:32











              • @gerrit that’s correct, and thanks for pointing it out.

                – Aryeh Leib Taurog
                Jan 10 at 19:21














































              Works for me with GNU sed (version 4.2.1):



              $ echo -ne $'\u9991' | sed 's/\xe9\xa6\x91//g' | hexdump -C
              $ echo -ne $'\u9991' | hexdump -C
              00000000 e9 a6 91


              (As another replacement for sed you could also use GNU awk, but it doesn't seem necessary.)
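If you want to check the byte-level approach without depending on GNU's \x escapes or on a UTF-8 locale, something along these lines should do. It is only a sketch under a few assumptions of mine: the variable names are made up, the character is built with printf octal escapes instead of $'\u9991', od stands in for hexdump, and LC_ALL=C is set so that sed compares raw bytes.

```shell
# U+9991 (馑) encoded as UTF-8 is the byte sequence e9 a6 91.
# Octal escapes are used because POSIX printf does not guarantee \x or \u.
target=$(printf '\351\246\221')

# Confirm the bytes (od is used here in place of hexdump).
target_hex=$(printf '%s' "$target" | od -An -tx1 | tr -d ' \n')
echo "$target_hex"

# LC_ALL=C makes sed work on bytes, so the result does not depend on
# whether the current locale is UTF-8.
result=$(printf '%s\n' '饥馑荐臻' | LC_ALL=C sed "s/$target//g")
echo "$result"
```

Matching on bytes is safe here because UTF-8 is self-synchronizing: the three-byte sequence for one character cannot appear straddling two other characters.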
















                  answered Apr 17 '15 at 10:16









                  Janis










































                      With recent versions of Bash, you can drop the quotes around the sed expression and use Bash's ANSI-C quoting ($'...') for the character. Any spaces in the expression, or parts of it that Bash might interpret as wildcards, can be quoted individually.



                      $ echo "饥馑荐臻" | sed s/$'\u9991'//g
                      饥荐臻
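To illustrate the point about quoting only the parts that need it, here is a rough sketch. Note that it is a variation rather than the exact command above: I assume the character is held in a variable (the names are my own) built with printf octal escapes, so it also runs in shells without $'\u....' support, and LC_ALL=C keeps sed byte-oriented.

```shell
# Build U+9991 (馑) from its UTF-8 bytes e9 a6 91 with portable octal escapes.
target=$(printf '\351\246\221')

# The sed expression as a whole is left unquoted; only the pieces that need
# protection are quoted: the variable, and a replacement containing spaces
# and a "*" that the shell could otherwise treat as a wildcard.
spaced=$(printf '%s\n' '饥馑荐臻' | LC_ALL=C sed s/"$target"/' * '/g)
echo "$spaced"   # prints: 饥 * 荐臻
```

The shell joins the adjacent quoted and unquoted pieces into a single argument, so sed still receives one complete s/// expression.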















                          answered 36 mins ago









                          Dave Rove

































