Get filename without extension in simple for loop
Mostly courtesy of Kamil, I have the following for loop to individually sort all text files inside of a folder (i.e. producing a sorted output file for each).
for file in *.txt;
do
printf 'Processing %s\n' "$file"
LC_ALL=C sort -u "$file" > "./${file}_sorted"
done
This is almost perfect, except that it currently outputs files in the format of:
originalfile.txt_sorted
...whereas I would like it to output files in the format of:
originalfile_sorted.txt
This is because the ${file} variable contains the filename including the extension. I'm running Cygwin on top of Windows. I'm not sure how this would behave in a true Linux environment, but in Windows, this shifting of the extension renders the file inaccessible by Windows Explorer.
How can I separate the filename from the extension so that I can add the _sorted suffix in between the two, allowing me to easily differentiate the original and sorted versions of the files while still keeping Windows' file extensions intact?
I've been looking at what might be possible solutions, but to me these seem more equipped to dealing with more complicated problems. More importantly, with my current bash knowledge, they go way over my head, so I'm holding out hope that there's a simpler solution which applies to my humble for loop, or else that someone can explain how to apply those solutions to my situation.
windows bash cygwin bash-scripting filenames
asked 3 hours ago
Hashim
1 Answer
These solutions you link to are in fact quite good. Some answers may lack explanation, so let's sort it out.
This line of yours
for file in *.txt
indicates the extension is known beforehand. In such case
basename -s .txt "$file"
should return the name without the extension (basename also removes directory path: /directory/path/filename → filename; in your case it doesn't matter because $file doesn't contain such path). To use the tool in your code, you need command substitution that looks like this in general: $(some_command). Command substitution takes the output of some_command, treats it as a string and places it where $(…) is. Your particular redirection will be
… > "./$(basename -s .txt "$file")_sorted.txt"
#      ^^^^^^^^^^^^^^^^^^^^^^^^^^^ the output of basename will replace this
Nested quotes are OK here because Bash is smart enough to know the quotes within $(…) are paired together.
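As a rough illustration of command substitution, the sketch below captures the output of basename in a variable before building the output name. The variable names name and out are only illustrative, and the POSIX form basename "$file" .txt is used here as an assumption about portability; with GNU coreutils, basename -s .txt "$file" gives the same result.

```bash
#!/usr/bin/env bash
# Sketch only: "name" and "out" are illustrative variable names.
file='originalfile.txt'

# Command substitution: the output of basename replaces $(...).
# POSIX form: basename STRING SUFFIX (GNU also accepts: basename -s .txt "$file")
name=$(basename "$file" .txt)

# Build the output path with _sorted placed before the extension.
out="./${name}_sorted.txt"
printf '%s\n' "$out"    # prints ./originalfile_sorted.txt

# basename also drops a leading directory path.
stripped=$(basename /directory/path/filename)
printf '%s\n' "$stripped"    # prints filename
```

Storing the result in a variable first keeps the redirection line short, at the cost of one extra assignment.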
This can be improved. Note that basename is a separate executable, not a shell builtin (in Bash, run type basename and compare with type cd). Spawning any extra process is costly: it takes resources and time. Spawning it in a loop usually performs poorly. Therefore you should use whatever the shell offers you to avoid extra processes. In this case the solution is:
… > "./${file%.txt}_sorted.txt"
The syntax is explained below for a more general case.
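For reference, here is a minimal sketch of the whole loop with this redirection filled in. The function name sort_txt_files and its directory argument are assumptions added for illustration; the loop body itself follows the question's code. The function runs in a subshell so that the cd does not affect the caller.

```bash
#!/usr/bin/env bash
# Sketch only: sort_txt_files is a hypothetical wrapper, not part of the original post.
# Usage: sort_txt_files /path/to/folder
sort_txt_files() (
    cd "$1" || exit 1                  # subshell body: the cd stays local
    for file in *.txt; do
        [ -e "$file" ] || continue     # the glob matched nothing
        printf 'Processing %s\n' "$file"
        # ${file%.txt} strips the known extension without spawning basename
        LC_ALL=C sort -u "$file" > "./${file%.txt}_sorted.txt"
    done
)
```

Running it a second time in the same folder would also process the earlier *_sorted.txt outputs, exactly as the original loop would.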
In case you don't know the extension:
… > "./${file%.*}_sorted.${file##*.}"
The syntax explained:
${file#*.}: $file, but the shortest string matching *. is removed from the front;
${file##*.}: $file, but the longest string matching *. is removed from the front; use it to get just an extension;
${file%.*}: $file, but the shortest string matching .* is removed from the end; use it to get everything but extension;
${file%%.*}: $file, but the longest string matching .* is removed from the end.
Pattern matching is glob-like, not regex. This means * is a wildcard for zero or more characters, ? is a wildcard for exactly one character (we don't need ? here). When you invoke ls *.txt or for file in *.txt; you're using the same pattern matching mechanism. A pattern without wildcards is allowed. We have already used ${file%.txt} where .txt is the pattern.
Example:
$ file=name.name2.name3.ext
$ echo "${file#*.}"
name2.name3.ext
$ echo "${file##*.}"
ext
$ echo "${file%.*}"
name.name2.name3
$ echo "${file%%.*}"
name
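Combining two of these expansions gives the output name for an unknown extension. The sketch below wraps that in a small function; sorted_name is a hypothetical helper name, and the sketch assumes the filename contains at least one dot.

```bash
#!/usr/bin/env bash
# Sketch only: sorted_name is a hypothetical helper.
# Assumes the name passed in contains at least one dot.
sorted_name() {
    local file=$1
    # ${file%.*}  -> everything but the extension
    # ${file##*.} -> just the extension
    printf '%s\n' "${file%.*}_sorted.${file##*.}"
}

sorted_name 'originalfile.txt'        # prints originalfile_sorted.txt
sorted_name 'name.name2.name3.ext'    # prints name.name2.name3_sorted.ext
```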
But beware:
$ file=extensionless
$ echo "${file#*.}"
extensionless
$ echo "${file##*.}"
extensionless
$ echo "${file%.*}"
extensionless
$ echo "${file%%.*}"
extensionless
For this reason the following contraption may be useful:
${file#${file%.*}}
It works by identifying everything but the extension (${file%.*}), then removing this from the front of the whole string. The results are like this:
$ file=name.name2.name3.ext
$ echo "${file#${file%.*}}"
.ext
$ file=extensionless
$ echo "${file#${file%.*}}"

$ # empty output above
Note the . is included this time. I think you might get unexpected results if $file contained literal * or ?; but Windows (where extensions matter) doesn't allow these characters in filenames anyway, so you may not care.
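If such characters are a concern, one possible variation (an assumption on my part, not something the question requires) is to double-quote the inner expansion, so that Bash treats its result as a literal string rather than as a pattern. The helper names ext_unquoted and ext_quoted below are hypothetical and exist only to compare the two forms.

```bash
#!/usr/bin/env bash
# Sketch only: both function names are hypothetical.

# Inner expansion unquoted: its result is interpreted as a glob pattern.
ext_unquoted() {
    local f=$1
    printf '%s\n' "${f#${f%.*}}"
}

# Inner expansion quoted: its result is matched literally.
ext_quoted() {
    local f=$1
    printf '%s\n' "${f#"${f%.*}"}"
}

ext_quoted   'name.name2.name3.ext'   # prints .ext
ext_unquoted '*.txt'                  # prints *.txt (the pattern * matched the empty string)
ext_quoted   '*.txt'                  # prints .txt
```

For ordinary names the two forms behave the same, so the quoted form costs nothing extra.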
Your improved redirection is:
… > "./${file%.*}_sorted${file#${file%.*}}"
It should support filenames with or without extension.
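A minimal sketch of a loop built around this idea follows. It is only an illustration under several assumptions: sort_all_files is a hypothetical name, the loop iterates over every regular file rather than just *.txt, the intermediate variables base and ext are added for readability, and the inner value is quoted so it is matched literally. It changes into the folder first because ${file%.*} would otherwise cut at a dot in a directory name.

```bash
#!/usr/bin/env bash
# Sketch only: sort_all_files, base and ext are hypothetical names.
# Usage: sort_all_files /path/to/folder
sort_all_files() (
    cd "$1" || exit 1                  # subshell body: work on bare filenames
    for file in *; do
        [ -f "$file" ] || continue     # skip directories and an unmatched glob
        base=${file%.*}                # everything but the extension
        ext=${file#"$base"}            # ".ext", or empty if there is no extension
        printf 'Processing %s\n' "$file"
        LC_ALL=C sort -u "$file" > "./${base}_sorted${ext}"
    done
)
```

The list of files is expanded once before the loop starts, so outputs created during a run are not picked up in that same run.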
edited 1 hour ago
answered 2 hours ago
Kamil Maciorowski