How to fetch FASTA sequences if the header line matches one from another file

















I have a file containing header lines (file 1) and another file containing sequences in FASTA format (file 2). I want to extract the FASTA records from file 2 whose header lines match a header in file 1.
Example:
File 1:



>sp|B7UM99|TIR_ECO27
>sp|P06616|ERA_ECOLI


File 2:



>sp|B7UM99|TIR_ECO27
MPIGNLGNNVNGNHLIPPAPPLPSQTDGAA
RGGTGHLISSTGALGSRSLFSPLRNSMADS
VDSRDIPGLPTNPSRLAAATSETCLLGGFE
VLHDKGPLDILNTQIGPSAFRVEVQADGTH
......
>sp|P06616|ERA_ECOLI
MSIDKSYCGFIAIVGRPNVGKSTLLNKLL
GQKISITSRKAQTTRHRIVGIHTEGAYQAIY
VDTPGLHMEEKRAINRLMNKAASSSIGDVE
LVIFVVEGTRWTPDDEMVLNKLREGKAPVI
............
>sp|P0AD68|HUMAN
MKAAAKTQKPKRQEEHANFISWRFALLCGC
ILLALAFLLGRVAWLQVISPDMLVKEGDMR
SLRVQQVSTSRGMITDRSGRPLAVSVPVKA
IWADPKEVHDAGGISVGDRWKALANALNIP
.............


Desired output:



>sp|B7UM99|TIR_ECO27
MPIGNLGNNVNGNHLIPPAPPLPSQTDGAA
RGGTGHLISSTGALGSRSLFSPLRNSMADS
VDSRDIPGLPTNPSRLAAATSETCLLGGFE
VLHDKGPLDILNTQIGPSAFRVEVQADGTH
......
>sp|P06616|ERA_ECOLI
MSIDKSYCGFIAIVGRPNVGKSTLLNKLL
GQKISITSRKAQTTRHRIVGIHTEGAYQAIY
VDTPGLHMEEKRAINRLMNKAASSSIGDVE
LVIFVVEGTRWTPDDEMVLNKLREGKAPVI
............















      linux text-processing bioinformatics














      asked 3 hours ago by Manoj Kumar
      edited 3 hours ago by Kusalananda






















          1 Answer













          grep has a -f flag that reads patterns from a file, and in your case we also need the 5 lines after each matching header. So what we can do is:



          grep -A5 -f file1.txt file2.txt


          Note that the files are assumed to be in the current working directory. If they are not, use the cd command to navigate there or provide the full path to each file.
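One caveat with the grep approach: -A5 prints a fixed number of lines after each match, so it only fits when every record has exactly five lines after its header, and without -F the header lines are interpreted as regular expressions. A record-aware alternative might look like the sketch below. It assumes that the headers in file 1 are identical, as whole lines, to the headers in file 2 (no trailing spaces or carriage returns) and that file 1 is not empty. The function name extract_fasta is hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Sketch, not a definitive solution: print every FASTA record in the
# second file whose header line appears verbatim in the first file.
# extract_fasta is a hypothetical helper name.
extract_fasta() {
    awk '
        # First file: remember each header line as an array key.
        NR == FNR { wanted[$0]; next }
        # Header line in the FASTA file: decide whether to keep this record.
        /^>/      { keep = ($0 in wanted) }
        # Print the header and its sequence lines while keep is set.
        keep
    ' "$1" "$2"
}
```

With the file names used above, it could be called as `extract_fasta file1.txt file2.txt > matched.fasta`. Because the headers are compared as plain strings rather than patterns, characters such as `|` or `[` in a header need no escaping, and each record is printed in full regardless of how many sequence lines it has.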

















          answered 46 mins ago by Sergiy Kolodyazhnyy













          • It still gives the following error: grep: final-out.txt:23097: Invalid range end

            – Manoj Kumar
            19 mins ago











          • Moreover, I tried the command grep -F -A10 -f final-out.txt output.fasta >database.fasta, but it outputs sequences that are not in file 1.

            – Manoj Kumar
            7 mins ago


















