Print each word of more than one character on a new line using awk, sed, grep
I have a text file, and I want to print every word of more than one character on a new line. If a word consists of a single character, it must be treated as part of the following word and printed on the same line as that word. If it sits between two words, it must go with the second word. Example:
Unix & Linux Stack Exchange is a question and answer site for users of Linux,
Expected output:
Unix
& Linux
Stack
Exchange
is
a question
and
answer
site
for
users
of
Linux
text-processing awk sed grep
What do you mean by "if it is in the middle between two words it must follow the second word"? What should happen to a one character word if there's no word following it?
– choroba, 4 hours ago
This is strongly related to unix.stackexchange.com/q/472204/4667 -- is this the same homework?
– glenn jackman, 1 hour ago
asked 5 hours ago by Zahi, edited 4 hours ago by choroba
2 Answers
Accepted answer (3 votes)
I'd reach for Perl-flavoured regex here:
$ s='Unix & Linux Stack Exchange is a question and answer site for users of Linux,'
$ echo "$s" | grep -Po '((^|\s)\K\S\s+)?\S{2,}'
Unix
& Linux
Stack
Exchange
is
a question
and
answer
site
for
users
of
Linux,
You can do the same with an extended regex, but since ERE lacks PCRE's \K and lookarounds, you end up capturing the leading space:
$ echo "$s" | grep -Eo '((^|[[:blank:]])[^[:blank:]][[:blank:]]+)?[^[:blank:]]{2,}'
Unix
 & Linux
Stack
Exchange
is
 a question
and
answer
site
for
users
of
Linux,
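If the captured leading blank is unwanted, one possible follow-up is to strip it with a second filter. This is only a sketch: it assumes a grep that supports -o (GNU or BSD grep, for example), and the function name ere_words is made up for illustration.

```shell
# Hypothetical helper: the ERE match from above, followed by a sed
# filter that trims the blank captured in front of a 1-character word.
ere_words() {
    grep -Eo '((^|[[:blank:]])[^[:blank:]][[:blank:]]+)?[^[:blank:]]{2,}' |
        sed 's/^[[:blank:]]*//'
}

s='Unix & Linux Stack Exchange is a question and answer site for users of Linux,'
printf '%s\n' "$s" | ere_words
```

The extra sed stage only touches the start of each output line, so blanks between a 1-character word and the word that follows it are left alone.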
I would have liked to use a word boundary marker prior to the 1-character word, but & is not a word character, so the word boundary is not useful.
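To see why the word boundary falls short, here is a small experiment, assuming a grep built with -P support. The pattern below is not from the answer; it is a hypothetical variant that uses \b in front of the 1-character word instead of (^|\s)\K, and boundary_words is an illustrative name.

```shell
# Hypothetical variant: \b only matches where exactly one neighbouring
# character is a word character (letter, digit or underscore).
boundary_words() { grep -Po '\b\S\s+\S{2,}'; }

# Space followed by "a": there is a boundary, so "a question" is found.
printf 'is a question\n' | boundary_words

# Space followed by "&": neither side is a word character, so no boundary.
printf 'Unix & Linux\n' | boundary_words || echo 'no match for "& Linux"'
```

The second command finds nothing, which is the case the answer avoids by testing for start-of-line or whitespace instead.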
I really appreciate the answer. Very helpful! I Googled so much, but I was not aware that there was a similar question out there. Thanks for sharing! Can I use the same regex in an awk statement?
– Zahi, 23 mins ago
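On the awk question: awk regexes are extended regexes without \K, and awk has no direct counterpart to grep's -o, so the pattern does not carry over as is. One way around that is to loop over the fields instead. The sketch below is not from either answer. It assumes words are separated by blanks, and it assumes that a 1-character word with nothing after it is printed on a line of its own (the question leaves that case open). The name join_short is hypothetical.

```shell
# Hypothetical helper: field loop instead of a regex match.
join_short() {
    awk '{
        pending = ""   # 1-character words waiting for a longer word
        for (i = 1; i <= NF; i++) {
            if (length($i) == 1) {
                pending = pending $i " "
            } else {
                print pending $i
                pending = ""
            }
        }
        # Assumption: leftover 1-character words at end of line get their own line.
        if (pending != "") {
            sub(/ $/, "", pending)
            print pending
        }
    }'
}

s='Unix & Linux Stack Exchange is a question and answer site for users of Linux,'
printf '%s\n' "$s" | join_short
```

Because awk splits on runs of blanks by default, repeated spaces or tabs between words collapse to a single space in the output.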
Answer (1 vote)
How about:
sed -r 's/([^ ]{2,}) /\1\n/g' file
Unix
& Linux
Stack
Exchange
is
a question
and
answer
site
for
users
of
Linux,
This checks whether a space is preceded by two or more non-space characters and, if so, replaces the match with the back-reference (the captured characters) followed by a <LF> (newline) character.
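A couple of edge cases are worth trying before relying on this. The sketch below assumes GNU sed, since \n in the replacement text is a GNU extension, and wraps the command in a hypothetical function sed_words that reads standard input.

```shell
# Hypothetical wrapper around the command above, reading standard input.
sed_words() { sed -r 's/([^ ]{2,}) /\1\n/g'; }

# A trailing 1-character word has no following word to join,
# so "a" ends up alone on the last line.
printf 'users of Linux a\n' | sed_words

# Consecutive 1-character words stay attached to the next long word:
# the second output line is "a b question".
printf 'is a b question\n' | sed_words
```

The pattern only looks at literal spaces, so tab-separated words would pass through unsplit.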
Very elegant solution
– glenn jackman, 1 hour ago
Thank you, sir, for the answer
– Zahi, 21 mins ago
Accepted answer by glenn jackman, answered 1 hour ago
Second answer by RudiC, answered 4 hours ago, edited 1 hour ago