Capture and compile a list of names from a log file













I need a one-line command to compile and print all of the Expendable Launch Vehicle names listed in a log file.



The ELV names are all listed in capital letters under the /elv directory.



The output should appear in the format of one name per line, with no duplicates:



ALICE
BOB
CHARLIE


I tried



grep "GET" NASA_access_log_Aug95.txt | grep "ELV" | wc -l


but it only printed the number of matching lines, not the ELV names.



Below is a sample of my log file NASA_access_log_Aug95.txt:



cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:31 -0400] "GET /elv/TITAN/mars1s.jpg HTTP/1.0" 200 1156
www-a2.proxy.aol.com - - [03/Aug/1995:20:43:31 -0400] "GET /elv/DELTA/dsolids.jpg HTTP/1.0" 200 24558
cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:32 -0400] "GET /elv/TITAN/mars3s.jpg HTTP/1.0" 200 1744
castor.gel.usherb.ca - - [03/Aug/1995:20:43:33 -0400] "GET /shuttle/missions/51-l/movies/ HTTP/1.0" 200 372
cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:33 -0400] "GET /elv/ATLAS_CENTAUR/atc69s.jpg HTTP/1.0" 200 1659
cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:35 -0400] "GET /elv/TITAN/mars2s.jpg HTTP/1.0" 200 1549
palona1.cns.hp.com - - [03/Aug/1995:20:43:36 -0400] "GET /shuttle/missions/sts-69/count69.gif HTTP/1.0" 200 46053
www-c1.proxy.aol.com - - [03/Aug/1995:20:43:38 -0400] "GET /shuttle/missions/sts-71/images/KSC-95EC-0882.gif HTTP/1.0" 200 51289
cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:40 -0400] "GET /elv/ATLAS_CENTAUR/acsuns.jpg HTTP/1.0" 200 2263
cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:41 -0400] "GET /elv/ATLAS_CENTAUR/goess.jpg HTTP/1.0" 200 1306
cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:45 -0400] "GET /elv/DELTA/dsolidss.jpg HTTP/1.0" 200 1629

















      edited 8 mins ago









      Zanna










      asked 4 hours ago









      강찬희





















          4 Answers






























          Don't you need just:



          awk -F'/' '/elv/ && !seen[$5]++ {print $5}' infile


          for the given sample, output would be:



          TITAN
          DELTA
          ATLAS_CENTAUR
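
To see why field 5 is the vehicle name, here is a small sketch that runs a slightly stricter variant of the awk program on four of the sample lines. The temporary file from mktemp and the variable names log and names are assumptions for illustration; the test `$4 == "elv"` is swapped in for the looser `/elv/` match so that a hostname containing "elv" cannot trigger a false hit.

```shell
# Write a few lines from the question's sample into a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:31 -0400] "GET /elv/TITAN/mars1s.jpg HTTP/1.0" 200 1156
www-a2.proxy.aol.com - - [03/Aug/1995:20:43:31 -0400] "GET /elv/DELTA/dsolids.jpg HTTP/1.0" 200 24558
cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:32 -0400] "GET /elv/TITAN/mars3s.jpg HTTP/1.0" 200 1744
castor.gel.usherb.ca - - [03/Aug/1995:20:43:33 -0400] "GET /shuttle/missions/51-l/movies/ HTTP/1.0" 200 372
EOF

# With "/" as the separator, the date [03/Aug/1995...] uses fields 1-3,
# so "elv" is field 4 and the vehicle name is field 5.
# seen[$5]++ is 0 (false) only the first time a name appears.
names=$(awk -F'/' '$4 == "elv" && !seen[$5]++ {print $5}' "$log")

printf '%s\n' "$names"   # prints TITAN, then DELTA (first-seen order)
rm -f "$log"
```

Because awk prints each name the first time it sees it, the output keeps the order of the log and needs no sort.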













































            You can do it like this:



            grep 'elv' NASA_access_log_Aug95.txt | awk '{print $7}' | sed 's/[a-z0-9./]//g' | sort -u


            Given your example snippet from the log file this will output:



            ATLAS_CENTAUR
            DELTA
            TITAN


            Explanation of the piped commands in the order they occur:




            • grep 'elv' NASA_access_log_Aug95.txt



              Outputs all lines containing elv:



              cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:31 -0400] "GET /elv/TITAN/mars1s.jpg HTTP/1.0" 200 1156
              www-a2.proxy.aol.com - - [03/Aug/1995:20:43:31 -0400] "GET /elv/DELTA/dsolids.jpg HTTP/1.0" 200 24558
              cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:32 -0400] "GET /elv/TITAN/mars3s.jpg HTTP/1.0" 200 1744
              cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:33 -0400] "GET /elv/ATLAS_CENTAUR/atc69s.jpg HTTP/1.0" 200 1659
              cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:35 -0400] "GET /elv/TITAN/mars2s.jpg HTTP/1.0" 200 1549
              cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:40 -0400] "GET /elv/ATLAS_CENTAUR/acsuns.jpg HTTP/1.0" 200 2263
              cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:41 -0400] "GET /elv/ATLAS_CENTAUR/goess.jpg HTTP/1.0" 200 1306
              cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:45 -0400] "GET /elv/DELTA/dsolidss.jpg HTTP/1.0" 200 1629



            • awk '{print $7}'



              Prints the 7th column, which holds the request path (the one you want). Remember that awk counts columns separated by whitespace.



              /elv/TITAN/mars1s.jpg
              /elv/DELTA/dsolids.jpg
              /elv/TITAN/mars3s.jpg
              /elv/ATLAS_CENTAUR/atc69s.jpg
              /elv/TITAN/mars2s.jpg
              /elv/ATLAS_CENTAUR/acsuns.jpg
              /elv/ATLAS_CENTAUR/goess.jpg
              /elv/DELTA/dsolidss.jpg



            • sed 's/[a-z0-9./]//g'



              Will filter out all unwanted characters (i.e. lower case a-z, numbers 0-9, . and /)



              TITAN
              DELTA
              TITAN
              ATLAS_CENTAUR
              TITAN
              ATLAS_CENTAUR
              ATLAS_CENTAUR
              DELTA



            • sort -u



              Removes duplicates and sorts the names alphabetically.



              ATLAS_CENTAUR
              DELTA
              TITAN
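
One caveat with deleting characters: a path that has only lower-case characters after /elv/ (a hypothetical /elv/index.html, say) would be reduced to an empty line. A possible variant, offered only as a sketch, anchors on the /elv/ prefix instead and uses grep -o, which GNU and BSD grep both provide but POSIX does not require. The temporary file and the variable names are assumptions for illustration.

```shell
# Write a few lines from the question's sample into a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:31 -0400] "GET /elv/TITAN/mars1s.jpg HTTP/1.0" 200 1156
www-a2.proxy.aol.com - - [03/Aug/1995:20:43:31 -0400] "GET /elv/DELTA/dsolids.jpg HTTP/1.0" 200 24558
castor.gel.usherb.ca - - [03/Aug/1995:20:43:33 -0400] "GET /shuttle/missions/51-l/movies/ HTTP/1.0" 200 372
cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:33 -0400] "GET /elv/ATLAS_CENTAUR/atc69s.jpg HTTP/1.0" 200 1659
cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:35 -0400] "GET /elv/TITAN/mars2s.jpg HTTP/1.0" 200 1549
EOF

# grep -o prints only the matched text, e.g. "/elv/TITAN".
# LC_ALL=C keeps [A-Z_] limited to ASCII capitals and the underscore.
# cut takes the third "/"-separated field (field 1 is empty, field 2 is "elv").
names=$(LC_ALL=C grep -o '/elv/[A-Z_][A-Z_]*' "$log" | cut -d/ -f3 | sort -u)

printf '%s\n' "$names"   # prints ATLAS_CENTAUR, DELTA, TITAN
rm -f "$log"
```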
















































              With Perl, using a regex to match the /-delimited element after /elv/ and storing it as a hash key:



              $ perl -lne '$h{$1}++ if m:/elv/(.*?)/: }{ for $k (sort keys %h) {print $k}' NASA_access_log_Aug95.txt
              ATLAS_CENTAUR
              DELTA
              TITAN















              answered 58 mins ago









              steeldriver


































                  You can also use sed with just a little help from sort



                  $ sed -rn '\|/elv/| s|.*/elv/([^/]+).*|\1|p' NASA_access_log_Aug95.txt | sort -u
                  ATLAS_CENTAUR
                  DELTA
                  TITAN


                  Explanation




                  • -r Use extended regex (saves a couple of backslashes)


                  • -n Don't print the lines we don't ask for


                  • \|/elv/| find lines with /elv/ (the \ at the start means use | not / to delimit the address)


                  • s|old|new| replace old with new


                  • .*/elv/ any characters before and including /elv/


                  • ([^/]+) save all the characters until the next /


                  • .* any number of any characters


                  • \1 reference to the characters we saved


                  • p print the lines we worked on


                  • sort -u sort the input and remove duplicates
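
The steps above can be tried on a few sample lines, as in the sketch below. It uses -E, which GNU sed accepts as an equivalent of -r and which BSD sed also understands; the temporary file and variable names are assumptions for illustration. Note the backslash before the first |, which tells sed that | delimits the address, and the \1 that refers back to the captured name.

```shell
# Write a few lines from the question's sample into a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:31 -0400] "GET /elv/TITAN/mars1s.jpg HTTP/1.0" 200 1156
www-a2.proxy.aol.com - - [03/Aug/1995:20:43:31 -0400] "GET /elv/DELTA/dsolids.jpg HTTP/1.0" 200 24558
castor.gel.usherb.ca - - [03/Aug/1995:20:43:33 -0400] "GET /shuttle/missions/51-l/movies/ HTTP/1.0" 200 372
cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:33 -0400] "GET /elv/ATLAS_CENTAUR/atc69s.jpg HTTP/1.0" 200 1659
cc-rd6-mg1-dip4-9.massey.ac.nz - - [03/Aug/1995:20:43:35 -0400] "GET /elv/TITAN/mars2s.jpg HTTP/1.0" 200 1549
EOF

# \|/elv/|  selects lines containing /elv/ (| is the address delimiter).
# s|...|\1|p replaces the whole line with the captured name and prints it.
# -n suppresses every line that was not explicitly printed.
names=$(sed -En '\|/elv/| s|.*/elv/([^/]+).*|\1|p' "$log" | sort -u)

printf '%s\n' "$names"   # prints ATLAS_CENTAUR, DELTA, TITAN
rm -f "$log"
```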














                      answered 9 mins ago









                      Zanna




















