Print each word longer than one character on a new line using awk, sed, grep

I have a text file, and I want to print every word of more than one character on its own line. If a word consists of a single character, it must be treated as part of the following word and printed on the same line as that word. If it sits between two words, it must go with the second word. Example:



Unix & Linux Stack Exchange is a question and answer site for users of Linux,


output



Unix
& Linux
Stack
Exchange
is
a question
and
answer
site
for
users
of
Linux






text-processing awk sed grep






asked 5 hours ago by Zahi
edited 4 hours ago by choroba











  • What do you mean by "if it is in the middle between two words it must follow the second word"? What should happen to a one character word if there's no word following it?
    – choroba
    4 hours ago










  • This is strongly related to unix.stackexchange.com/q/472204/4667 -- is this the same homework?
    – glenn jackman
    1 hour ago
















2 Answers
I'd reach for Perl-flavoured regex here:



$ echo "$s" | grep -Po '((^|\s)\K\S\s+)?\S{2,}'
Unix
& Linux
Stack
Exchange
is
a question
and
answer
site
for
users
of
Linux,


You can do the same with extended regex, but as it doesn't have pcre's lookarounds, you end up capturing the leading space:



$ echo "$s" | grep -Eo '((^|[[:blank:]])[^[:blank:]][[:blank:]]+)?[^[:blank:]]{2,}'
Unix
 & Linux
Stack
Exchange
is
a question
and
answer
site
for
users
of
Linux,


I would have liked to use a word boundary marker prior to the 1-character word, but & is not a word character, so the word boundary is not useful.
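For readers who want to run the commands above, here is a minimal sketch. It assumes `s` holds the sample line from the question (the answer does not show how `s` was set) and that `grep` is GNU grep built with PCRE support for `-P`. The second command is an illustrative variant, not taken from the answer, that shows the word-boundary problem just described.

```shell
# Assumption: s holds the sample line from the question.
s='Unix & Linux Stack Exchange is a question and answer site for users of Linux,'

# (^|\s)  start of line or a whitespace character
# \K      discard what has matched so far, so the leading space is not printed
# \S\s+   a one-character word and the whitespace after it (the group is optional)
# \S{2,}  a word of two or more non-whitespace characters
pcre_out=$(printf '%s\n' "$s" | grep -Po '((^|\s)\K\S\s+)?\S{2,}')
printf '%s\n' "$pcre_out"

# Illustrative variant using \b instead: there is no word boundary between
# a space and "&", so "&" is not attached and the second line is just "Linux".
wb_out=$(printf '%s\n' "$s" | grep -Po '(\b\S\s+)?\S{2,}')
printf '%s\n' "$wb_out" | sed -n '2p'
```

The `\b` variant still attaches "a" to "question", because "a" is a word character; only non-word characters such as "&" are lost.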







answered 1 hour ago by glenn jackman (accepted answer)











  • I really appreciate the answer. Very helpful! I Googled a lot, but I was not aware that there was a similar question out there. Thanks for sharing! Can I use the same regex in an awk statement?
    – Zahi
    23 mins ago
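On the awk question in the comment above: awk uses extended regular expressions, which have no `\K`, and it has no direct counterpart to `grep -o`, so the pattern does not carry over as written. A minimal sketch of a field-based alternative follows. It is not from either answer, the function name `words_awk` is hypothetical, and printing a trailing one-character word on its own line is an assumption, since the question leaves that case open.

```shell
# Assumption: s holds the sample line from the question.
s='Unix & Linux Stack Exchange is a question and answer site for users of Linux,'

# Hypothetical helper: walk the whitespace-separated fields of each line,
# holding back one-character words until a longer word arrives.
words_awk() {
  awk '{
    prefix = ""
    for (i = 1; i <= NF; i++) {
      if (length($i) == 1) {
        prefix = prefix $i " "      # keep the short word for the next long one
      } else {
        print prefix $i             # long word, with any held short words
        prefix = ""
      }
    }
    if (prefix != "") {             # assumption: leftover short words print alone
      sub(/ $/, "", prefix)
      print prefix
    }
  }'
}

printf '%s\n' "$s" | words_awk
```

Because awk splits fields on any run of blanks, this version also copes with tabs and repeated spaces between words.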
















How about



sed -r 's/([^ ]{2,}) /\1\n/g' file
Unix
& Linux
Stack
Exchange
is
a question
and
answer
site
for
users
of
Linux,


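This matches a run of two or more non-space characters followed by a space, and replaces it with the captured run (the back reference \1) plus a <LF> character in place of the space. A single character followed by a space does not match, so it stays on the same line as the word after it.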
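A runnable sketch of the same substitution, reading from a variable instead of a file. It assumes GNU sed, where `-E` is equivalent to `-r` and `\n` in the replacement produces a newline (other sed implementations may not accept this), and it assumes `s` holds the sample line from the question.

```shell
# Assumption: s holds the sample line from the question.
s='Unix & Linux Stack Exchange is a question and answer site for users of Linux,'

# ([^ ]{2,})  capture a run of two or more non-space characters
# " "         followed by a single literal space
# \1\n        put the captured run back and turn the space into a newline
sed_out=$(printf '%s\n' "$s" | sed -E 's/([^ ]{2,}) /\1\n/g')
printf '%s\n' "$sed_out"
```

Only a literal space counts as a separator here, so input that separates words with tabs would need a different bracket expression.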







answered 4 hours ago by RudiC (edited 1 hour ago)











  • Very elegant solution
    – glenn jackman
    1 hour ago










  • Thank you, sir, for the answer
    – Zahi
    21 mins ago















