


How to remove duplicate lines inside a specific tag in an XML file

























Suppose you have the following file:



...
<tag2>
a
b
c
a
</tag2>
...
<tag2>
x
y
y
z
x
</tag2>


How can I remove the duplicate lines inside each <tag2>, as in the example below?



...
<tag2>
a
b
c
</tag2>
...
<tag2>
x
y
z
</tag2>


I would like to search every file in a directory and its subdirectories and remove these duplicates.































  • Does the file have that exact layout, consistently? Is there any structure to the a, b, etc? Can you use a real XML parser here?

    – Michael Homer
    5 hours ago













  • @MichaelHomer Not really; the number of tags is unknown, and the order and number of <tag2> tags are also unknown.

    – orangesky
    5 hours ago


















bash text-formatting xml















asked 5 hours ago









orangesky














1 Answer














I'm not sure how complex your files can be, but for the example given, this appears to work.



$ awk '/^<[a-z]/{print;delete z}!/^</{z[$0]=1}/^<\//{for(x in z){print x}print}' file1
<tag2>
a
b
c
</tag2>
<tag2>
x
y
z
</tag2>
$


Commented version



awk '/^<[a-z]/ {        # If start tag
    print               # Print line
    delete z            # Clear array
} !/^</ {               # If not a tag
    z[$0]=1             # Store line
} /^<\// {              # If end tag
    for(x in z) {       # For each array entry
        print x         # Print array entry
    }
    print               # Print end tag
}' file1
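
One caveat with the approach above: awk does not guarantee the order in which `for (x in z)` visits array entries, so the surviving lines may not come out in their original order on every awk implementation. The answer also does not cover the "every file in the directory and subdirectories" part of the question. The following is only a sketch of one way to handle both, not a tested solution for arbitrary XML. The function names `dedup_tag2` and `dedup_tree`, the `*.xml` file pattern, and the assumption that `<tag2>` and `</tag2>` each sit on their own line are mine, not from the question.

```shell
#!/bin/sh
# Sketch: drop repeated lines inside each <tag2>...</tag2> block, keeping the
# first occurrence of every line in its original order.
# Assumption: <tag2> and </tag2> each appear on a line of their own.
dedup_tag2() {
    awk '
        # Opening tag: start a new block and forget lines seen so far
        /^[[:space:]]*<tag2>/   { inside = 1; split("", seen); print; next }
        # Closing tag: leave the block
        /^[[:space:]]*<\/tag2>/ { inside = 0; print; next }
        # Inside a block, skip any line that was already printed
        inside && seen[$0]++    { next }
        # Everything else passes through unchanged
        { print }
    ' "$1"
}

# Rewrite every *.xml file under a directory (default: current directory).
# Assumptions: the "*.xml" pattern fits your files, and no file name
# contains a newline.
dedup_tree() {
    find "${1:-.}" -type f -name '*.xml' | while IFS= read -r f; do
        tmp=$(mktemp) || break
        # Overwrite the original only if awk succeeded
        dedup_tag2 "$f" > "$tmp" && cat "$tmp" > "$f"
        rm -f "$tmp"
    done
}
```

`dedup_tag2 file1` prints the cleaned file to standard output, while `dedup_tree /path/to/dir` rewrites files in place, so it is worth trying it on a copy of the directory first. Lines outside `<tag2>` blocks are left untouched. For files that do not follow this one-tag-per-line layout, a real XML parser is likely the safer route.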













edited 4 hours ago

answered 4 hours ago

steve





















