Get filename without extension in Bash

I have the following for loop to individually sort all text files inside of a folder (i.e. producing a sorted output file for each).



for file in *.txt; do
    printf 'Processing %s\n' "$file"
    LC_ALL=C sort -u "$file" > "./${file}_sorted"
done


This is almost perfect, except that it currently outputs files in the format of:



originalfile.txt_sorted


...whereas I would like it to output files in the format of:



originalfile_sorted.txt 


This is because the $file variable contains the filename including the extension. I'm running Cygwin on top of Windows. I'm not sure how this would behave in a true Linux environment, but in Windows, this shifting of the extension renders the file inaccessible by Windows Explorer.



How can I separate the filename from the extension so that I can add the _sorted suffix between the two? That would let me easily tell the original and sorted versions of the files apart while still keeping Windows' file extensions intact.

I've been looking at what might be possible solutions, but to me these seem better suited to more complicated problems. More importantly, with my current bash knowledge, they go way over my head, so I'm holding out hope that there's a simpler solution which applies to my humble for loop, or else that someone can explain how to apply those solutions to my situation.










Tags: windows bash cygwin bash-scripting filenames

asked 5 hours ago by Hashim, edited 16 mins ago by Kamil Maciorowski

1 Answer

These solutions you link to are in fact quite good. Some answers may lack explanation, so let's sort it out.

This line of yours

for file in *.txt

indicates the extension is known beforehand. In that case

basename -s .txt "$file"

should return the name without the extension (basename also removes the directory path: /directory/path/filename → filename; in your case it doesn't matter because $file doesn't contain such a path). To use the tool in your code, you need command substitution, which in general looks like this: $(some_command). Command substitution takes the output of some_command, treats it as a string and places it where $(…) is. Your particular redirection will be

… > "./$(basename -s .txt "$file")_sorted.txt"
#      ^^^^^^^^^^^^^^^^^^^^^^^^^^^ the output of basename will replace this

Nested quotes are OK here because Bash is smart enough to know the quotes within $(…) are paired together.
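Put back into the loop from the question, the basename variant might look like the sketch below. The scratch directory and the sample file notes.txt are assumptions added only so the snippet can be run without touching real data.

```shell
#!/usr/bin/env bash
# Sketch only: work in a throwaway directory (assumption, not part of the question).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'b\na\nb\n' > notes.txt    # hypothetical sample input

for file in *.txt; do
    printf 'Processing %s\n' "$file"
    # basename strips the known ".txt" suffix; $(...) pastes its output in place.
    LC_ALL=C sort -u "$file" > "./$(basename -s .txt "$file")_sorted.txt"
done

sorted_content=$(cat notes_sorted.txt)
printf '%s\n' "$sorted_content"
```

The -s option is an extension found in GNU and BSD basename; the form specified by POSIX is basename "$file" .txt, which should behave the same here.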



This can be improved. Note basename is a separate executable, not a shell builtin (in Bash run type basename, compare to type cd). Spawning any extra process is costly: it takes resources and time. Spawning one on every iteration of a loop usually performs poorly. Therefore you should use whatever the shell offers you to avoid extra processes. In this case the solution is:

… > "./${file%.txt}_sorted.txt"

The syntax is explained below for a more general case.
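As a rough sketch, here is the whole loop with the parameter expansion in place of basename. The scratch directory and both sample files are made up for illustration; one name contains a space to show that the quoting holds.

```shell
#!/usr/bin/env bash
# Sketch only: scratch directory and sample files are assumptions.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'pear\napple\npear\n' > fruit.txt
printf '2\n1\n' > 'my notes.txt'

for file in *.txt; do
    printf 'Processing %s\n' "$file"
    # ${file%.txt} drops the trailing ".txt" inside the shell, no extra process.
    LC_ALL=C sort -u "$file" > "./${file%.txt}_sorted.txt"
done

fruit_sorted=$(cat fruit_sorted.txt)
ls
```

Note that the output names also end in .txt, so running the same loop a second time in the same directory would pick them up as input as well.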




In case you don't know the extension:

… > "./${file%.*}_sorted.${file##*.}"

The syntax explained:

• ${file#*.} – $file, but the shortest string matching *. is removed from the front;
• ${file##*.} – $file, but the longest string matching *. is removed from the front; use it to get just the extension;
• ${file%.*} – $file, but the shortest string matching .* is removed from the end; use it to get everything but the extension;
• ${file%%.*} – $file, but the longest string matching .* is removed from the end.

Pattern matching is glob-like, not regex. This means * is a wildcard for zero or more characters, ? is a wildcard for exactly one character (we don't need ? in your case though). When you invoke ls *.txt or for file in *.txt; you're using the same pattern matching mechanism. A pattern without wildcards is allowed. We have already used ${file%.txt} where .txt is the pattern.

Example:

$ file=name.name2.name3.ext
$ echo "${file#*.}"
name2.name3.ext
$ echo "${file##*.}"
ext
$ echo "${file%.*}"
name.name2.name3
$ echo "${file%%.*}"
name
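The ? wildcard is not needed for the question, but a small sketch may make the glob rules easier to picture. The filename report.v2.txt is a made-up example.

```shell
#!/usr/bin/env bash
file=report.v2.txt            # hypothetical filename

no_txt=${file%.txt}           # literal pattern: removes only ".txt"
no_ver=${file%.v?.txt}        # ? matches exactly one character (the "2")

printf '%s\n' "$no_txt" "$no_ver"
# prints:
# report.v2
# report
```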


But beware:

$ file=extensionless
$ echo "${file#*.}"
extensionless
$ echo "${file##*.}"
extensionless
$ echo "${file%.*}"
extensionless
$ echo "${file%%.*}"
extensionless

For this reason the following contraption may be useful:

${file#${file%.*}}

It works by identifying everything but the extension (${file%.*}), then removing this from the front of the whole string. The results look like this:

$ file=name.name2.name3.ext
$ echo "${file#${file%.*}}"
.ext
$ file=extensionless
$ echo "${file#${file%.*}}"

$ # empty output above

Note the . is included this time. I think you might get unexpected results if $file contained a literal * or ?; but Windows (where extensions matter) doesn't allow these characters in filenames anyway, so you may not care.
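Square brackets are allowed in Windows filenames and are glob characters too, so they can trigger the same kind of surprise. A possible way around it, sketched below, is to double-quote the inner expansion so that its result is matched literally. The filename data[1].txt is a made-up example.

```shell
#!/usr/bin/env bash
file='data[1].txt'                 # hypothetical name containing glob characters

ext_unquoted=${file#${file%.*}}    # inner result "data[1]" is read as a glob pattern
ext_quoted=${file#"${file%.*}"}    # inner result is matched as a literal string

printf 'unquoted: %s\nquoted:   %s\n' "$ext_unquoted" "$ext_quoted"
# prints:
# unquoted: data[1].txt
# quoted:   .txt
```

In the unquoted case the pattern data[1] would match data1 but not data[1], so nothing is removed from the front.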



Your improved redirection is:

… > "./${file%.*}_sorted${file#${file%.*}}"

It should support filenames with or without an extension.
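A complete loop built on this idea could look like the following sketch. Several things in it are assumptions that go beyond the answer: the scratch directory and sample files, the helper variables base and ext, the quoting of "$base" (so it is matched literally), and the case guard that skips files already produced by an earlier run.

```shell
#!/usr/bin/env bash
# Sketch only: scratch directory and sample files are assumptions.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'b\na\n' > list.txt
printf 'z\ny\n' > README            # no extension
printf 'q\np\n' > archive.tar.gz    # only the last extension is split off

for file in *; do
    [ -f "$file" ] || continue      # regular files only
    # Assumption (not in the answer): skip outputs of a previous run.
    case $file in *_sorted|*_sorted.*) continue ;; esac
    base=${file%.*}                 # everything but the extension
    ext=${file#"$base"}             # ".ext" or empty; quoted so $base is literal
    printf 'Processing %s\n' "$file"
    LC_ALL=C sort -u "$file" > "./${base}_sorted${ext}"
done

ls
```

The helper variables only keep the redirection readable; the one-line form from the answer should give the same names for these sample files.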






answered 5 hours ago by Kamil Maciorowski, edited 10 mins ago