Extracting organism count with their corresponding ID?
I have a file with many columns, like:
ID1 XP_026389348.1_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_[Papaver_somniferum]
ID2 XP_026389348.1_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_[Papaver_somniferum]
ID3 XP_026389348.1_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_[Papaver_somniferum]
ID4 XP_026389348.1_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_[Papaver_somniferum]
ID5 XP_026389348.1_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_[Papaver_somniferum]
ID6 XP_022013305.1_60S_ribosomal_protein_L36-2-like_[Helianthus_annuus]
ID7 XP_022033863.1_60S_ribosomal_protein_L36-2-like_[Helianthus_annuus]
ID8 XP_022033864.1_60S_ribosomal_protein_L36-2-like_[Helianthus_annuus]
ID9 XP_022033865.1_60S_ribosomal_protein_L36-2-like_[Helianthus_annuus]
ID10 NP_850400.1_Plant_stearoyl-acyl-carrier-protein_desaturase_family_protein_[Arabidopsis_thaliana]
ID11 XP_015383392.1_60S_ribosomal_protein_L36-3-like_[Citrus_sinensis]
ID12 XP_015383392.1_60S_ribosomal_protein_L36-3-like_[Citrus_sinensis]
ID13 XP_019051818.1_PREDICTED:_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_isoform_X2_[Nelumbo_nucifera]
ID14 XP_019051818.1_PREDICTED:_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_isoform_X2_[Nelumbo_nucifera]
ID15 XP_019051818.1_PREDICTED:_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_isoform_X2_[Nelumbo_nucifera]
ID16 XP_021982111.1_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_[Helianthus_annuus]
ID17 NP_001150213.1_uncharacterized_protein_LOC100283843_[Zea_mays]
ID18 XP_027164486.1_stearoyl-[acyl-carrier-protein]_9-desaturase,_chloroplastic_[Coffea_eugenioides]
ID19 XP_009419937.1_PREDICTED:_60S_ribosomal_protein_L36-3-like_[Musa_acuminata]
ID20 XP_020267482.1_60S_ribosomal_protein_L36-2-like_[Asparagus_officinalis]
I want to extract the organism name from the last [ ] in the 2nd column and count the occurrences together with their respective IDs, like this:
5 Papaver somniferum ID1
ID2
ID3
ID4
ID5
4 Helianthus annuus ID6
ID7
ID8
ID9
1 Arabidopsis thaliana ID10
2 Citrus sinensis ID11
ID12
3 Nelumbo nucifera ID13
ID14
ID15
1 Helianthus annuus ID16
1 Zea mays ID17
1 Coffea eugenioides ID18
1 Musa acuminata ID19
1 Asparagus officinalis ID20
I have tried the following:
cat file | cut -f2 | rev |awk -F "[" '{gsub("]", "");print $1 | "rev"}' | sed '/#/d' | sort |uniq -c| sort -nr
which gives the output:
5 Papaver somniferum
4 Helianthus annuus
1 Arabidopsis thaliana
2 Citrus sinensis
3 Nelumbo nucifera
1 Helianthus annuus
1 Zea mays
1 Coffea eugenioides
1 Musa acuminata
1 Asparagus officinalis
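A minimal sketch of one way to keep the IDs next to the counts, under some assumptions: the columns are separated by whitespace with no spaces inside the 2nd column (as in the sample above), the organism is whatever sits in the last [ ] of that column, and only consecutive lines with the same organism are grouped (which is why Helianthus annuus appears twice in the desired output). The function name count_organisms is hypothetical, and underscores are turned into spaces only to match the desired listing.

```shell
#!/bin/sh
# Hypothetical helper: group consecutive lines by the organism named in the
# last [...] of column 2, then print "count organism firstID" followed by
# the remaining IDs of that run, one per line.
count_organisms() {
    awk '
    # Print the run collected so far (does nothing before the first line).
    function flush_run(   i) {
        if (n == 0) return
        printf "%d %s %s\n", n, org, ids[1]
        for (i = 2; i <= n; i++) print ids[i]
    }
    {
        name = $2
        sub(/.*\[/, "", name)    # greedy: drops everything up to the last "["
        sub(/\].*/, "", name)    # drops the closing "]" and anything after it
        gsub(/_/, " ", name)     # assumption: underscores stand for spaces
        if (name != org) {       # organism changed: finish the previous run
            flush_run()
            org = name
            n = 0
        }
        ids[++n] = $1            # remember the ID from column 1
    }
    END { flush_run() }
    ' "$@"
}
```

It could be called as count_organisms file, or at the end of a pipeline since it reads standard input when no file is given. Nothing is sorted, so the runs come out in input order, which is what keeps each ID attached to its own group.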
Thank you.
bash shell ubuntu bioinformatics
edited 24 mins ago by Kusalananda♦
asked 1 hour ago by Anjali Chaudhary