


Extracting organism counts with their corresponding IDs?























I have a file with several columns, like this:



ID1 XP_026389348.1_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_[Papaver_somniferum]
ID2 XP_026389348.1_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_[Papaver_somniferum]
ID3 XP_026389348.1_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_[Papaver_somniferum]
ID4 XP_026389348.1_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_[Papaver_somniferum]
ID5 XP_026389348.1_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_[Papaver_somniferum]
ID6 XP_022013305.1_60S_ribosomal_protein_L36-2-like_[Helianthus_annuus]
ID7 XP_022033863.1_60S_ribosomal_protein_L36-2-like_[Helianthus_annuus]
ID8 XP_022033864.1_60S_ribosomal_protein_L36-2-like_[Helianthus_annuus]
ID9 XP_022033865.1_60S_ribosomal_protein_L36-2-like_[Helianthus_annuus]
ID10 NP_850400.1_Plant_stearoyl-acyl-carrier-protein_desaturase_family_protein_[Arabidopsis_thaliana]
ID11 XP_015383392.1_60S_ribosomal_protein_L36-3-like_[Citrus_sinensis]
ID12 XP_015383392.1_60S_ribosomal_protein_L36-3-like_[Citrus_sinensis]
ID13 XP_019051818.1_PREDICTED:_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_isoform_X2_[Nelumbo_nucifera]
ID14 XP_019051818.1_PREDICTED:_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_isoform_X2_[Nelumbo_nucifera]
ID15 XP_019051818.1_PREDICTED:_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_isoform_X2_[Nelumbo_nucifera]
ID16 XP_021982111.1_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_[Helianthus_annuus]
ID17 NP_001150213.1_uncharacterized_protein_LOC100283843_[Zea_mays]
ID18 XP_027164486.1_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_[Coffea_eugenioides]
ID19 XP_009419937.1_PREDICTED:_60S_ribosomal_protein_L36-3-like_[Musa_acuminata]
ID20 XP_020267482.1_60S_ribosomal_protein_L36-2-like_[Asparagus_officinalis]


I want to extract the organism name from the final pair of square brackets in the 2nd column, and count the organisms together with their respective IDs, like this:



5 Papaver somniferum    ID1
                        ID2
                        ID3
                        ID4
                        ID5
4 Helianthus annuus     ID6
                        ID7
                        ID8
                        ID9
1 Arabidopsis thaliana  ID10
2 Citrus sinensis       ID11
                        ID12
3 Nelumbo nucifera      ID13
                        ID14
                        ID15
1 Helianthus annuus     ID16
1 Zea mays              ID17
1 Coffea eugenioides    ID18
1 Musa acuminata        ID19
1 Asparagus officinalis ID20


I have tried this:



cat file | cut -f2 | rev |awk -F "[" '{gsub("]", "");print $1 | "rev"}' | sed '/#/d' | sort |uniq -c| sort -nr


which gives the output:



5   Papaver somniferum
4 Helianthus annuus
1 Arabidopsis thaliana
2 Citrus sinensis
3 Nelumbo nucifera
1 Helianthus annuus
1 Zea mays
1 Coffea eugenioides
1 Musa acuminata
1 Asparagus officinalis


Thank you.
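One possible approach, offered as a sketch rather than a definitive answer: a single awk pass can take the text inside the last pair of square brackets of column 2 as the organism, turn underscores into spaces, and count runs of consecutive lines that share the same organism. Counting runs rather than overall totals is an assumption, based on Helianthus annuus appearing twice in the desired output. The function name count_runs is hypothetical, and the code assumes the columns are separated by whitespace and that the input is already in the order you want.

```shell
# count_runs (hypothetical name): for each run of consecutive lines with the
# same organism, print "count organism first-ID", then the remaining IDs of
# that run on lines of their own. Reads the named files, or stdin if none.
count_runs() {
    awk '
    # Print the run collected so far, then reset the counter.
    function flush(   i) {
        if (n == 0) return
        printf "%d %s %s\n", n, org, ids[1]
        for (i = 2; i <= n; i++) print ids[i]
        n = 0
    }
    {
        name = $2
        sub(/^.*\[/, "", name)   # greedy: drop everything up to the last "["
        sub(/\]$/, "", name)     # drop the closing "]"
        gsub(/_/, " ", name)     # Papaver_somniferum -> Papaver somniferum
        if (name != org) { flush(); org = name }
        ids[++n] = $1            # remember this ID as part of the current run
    }
    END { flush() }
    ' "$@"
}
```

With the sample saved as file, `count_runs file` would start with `5 Papaver somniferum ID1`, followed by ID2 to ID5 on their own lines, and would list Helianthus annuus twice (4 for ID6 to ID9, then 1 for ID16). The continuation IDs are not padded into a column. If overall totals per organism are wanted instead, the input would need to be sorted by organism first.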

















      bash shell ubuntu bioinformatics














      edited 24 mins ago by Kusalananda

      asked 1 hour ago by Anjali Chaudhary
























