Collecting specific genome data from a file under a single title

I have genome data in a file genomes-seq.txt. The title of each sequence begins with > followed by the genome name:



>genome.1
atcg
atcg
atcggtc

>genome.2
atct
tgcgtgctt
attttt

>genome.
sdkf
sdf;ksdf
sdlfkjdslc
edsfsfv

>genome.3
as;ldkhaskjd
asdkljdsl
asdkljasdk;l

>genome.4
ekjfhdhsa
dsfkjskajd
asdknasd


>genome.1
iruuwi
sdkljbh
sdfljnsdl

>genome.234
efijhusidh
siduhygfhuji

>genome.1
ljhdcj
sdljhsdil
fweusfhygc


I want to collect all the data for genome.1 under a single title in one file, so that it looks like this:



>genome.1
atcg
atcggtc

iruuwi
sdkljbh
sdfljnsdl
ljhdcj
sdljhsdil
fweusfhygc


But every time I do it with sed, I get:



>genome.1
atcg
atcg
atcggtc

>genome.1
iruuwi
sdkljbh
sdfljnsdl

>genome.1
ljhdcj
sdljhsdil
fweusfhygc


The genome.1 title is repeated for every block. How can I do this correctly, so that on a large data set I don't need to remove all the repetitions?










  • Hi @paul, which sed command did you use?
    – Goro
    58 mins ago

  • I tried but it didn't work
    – paul
    54 mins ago

  • Show what you tried and we can help fix your errors.
    – glenn jackman
    24 mins ago

bash






asked 1 hour ago by paul
edited 35 mins ago by Rui F Ribeiro

2 Answers

accepted










$ sed -n '/^>genome\.1$/,/^$/p' file | sed '2,${/^>genome\.1$/d;}'

>genome.1
atcg
atcggtc

iruuwi
sdkljbh
sdfljnsdl
ljhdcj
sdljhsdil
fweusfhygc


genome.1 is the keyword; change it to the name of the genome whose sequences you want to collect.
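
To avoid editing the keyword in two places, the pipeline can be wrapped in a small shell function. This is only a sketch: the function name collect_genome is made up for illustration, and it assumes that each title sits alone on its line (no trailing spaces) and that blocks are separated by an empty line, as in the sample.

```shell
# collect_genome NAME FILE
# Hypothetical helper (the name is an assumption, not part of the answer).
# Prints every block titled ">NAME" from FILE, keeping only the first title.
collect_genome() {
    # Escape dots so that "genome.1" matches a literal dot, not any character.
    pattern=$(printf '%s\n' "$1" | sed 's/\./\\./g')

    # First sed: print each range from a matching title to the next empty line.
    # Second sed: from line 2 onward, delete the repeated title lines.
    sed -n "/^>${pattern}\$/,/^\$/p" "$2" |
        sed "2,\${/^>${pattern}\$/d;}"
}
```

It could then be called as collect_genome genome.1 genomes-seq.txt > genome.1.txt, and because the title is anchored at both ends, genome.1 should not also pick up titles such as genome.12.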






With perl:

perl -00 -ne 'if (/^>genome\.1\n/) { s/// if $seen++; print }' file
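
For comparison, here is a single-pass awk version. It is not taken from either answer; it is a sketch that compares the title as a plain string, so the dot needs no escaping, and it does not rely on blank lines between blocks. The function name collect_genome_awk is hypothetical, and dropping the empty lines from the output is an assumption about what is wanted.

```shell
# collect_genome_awk NAME FILE
# Hypothetical helper (the name is an assumption made for this sketch).
# Prints the title ">NAME" once, followed by the sequence lines of every
# block in FILE that carries exactly that title.
collect_genome_awk() {
    awk -v title=">$1" '
        /^>/ {
            # A title line starts a new block; keep it only on an exact match.
            keep = ($0 == title)
            # Print the wanted title the first time it is seen, never again.
            if (keep && !seen++) print
            next
        }
        # Print non-empty lines that belong to a wanted block.
        keep && NF
    ' "$2"
}
```

Usage would be collect_genome_awk genome.1 genomes-seq.txt. Since awk reads the file once and keeps no block in memory, this approach should also be fine for large data sets.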





sed answer: answered 52 mins ago by Goro, edited 17 mins ago
perl answer: answered 20 mins ago by glenn jackman