Delimit by space but ignore backslash space


5678
testing,\ group
[testing
ip\ 5.6.7.8
launch-wizard-1 0.0.0.0/0
456dlkjfa
1.2.3.4
test 1.2.3.4/32 4.3.2.0/23 4.3.2.0/23
default 4.3.2.0/23 4.3.2.0/23
launch-wizard-2 0.0.0.0/0
launch-wizard-3 0.0.0.0/0
2.3.4.5/32

I would like to get the first column of the above, but the catch is that I need to treat a backslash followed by a space (\ ) as part of the column, so awk '{print $1}' should give me:

5678
testing,\ group
[testing
ip\ 5.6.7.8
launch-wizard-1
456dlkjfa
1.2.3.4
test
default
launch-wizard-2
launch-wizard-3
2.3.4.5/32

edited 6 hours ago by αғsнιη

asked 7 hours ago by GypsyCosmonaut

text-processing awk sed
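
For anyone wanting to reproduce the problem, here is a minimal sketch. The sample helper is hypothetical and holds only three rows of the data above; it shows that awk's default splitting cuts the column at the escaped space.

```shell
#!/bin/sh
# sample: hypothetical helper that emits three rows of the question's data.
# The quoted heredoc delimiter keeps the backslashes literal.
sample() {
cat <<'EOF'
testing,\ group
ip\ 5.6.7.8
launch-wizard-1 0.0.0.0/0
EOF
}

# Default field splitting treats every space as a separator, escaped or not,
# so the first two rows are cut short right after the backslash.
sample | awk '{print $1}'
```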


4 Answers

4 votes, accepted

With GNU awk (gawk) you can use some zero-length assertions like \< or \>:

$ echo 'a\ b c' | gawk 'BEGIN{FS="\\> +"} {print $1}'
a\ b

but unfortunately not the full-blown ones from Perl or PCRE (e.g. (?<!\\), (?<=\w), etc.):

$ echo 'a\ b, c' | perl -nle '@a=split /(?<!\\)\s+/, $_; print $a[0]'
a\ b,

edited 3 mins ago

answered 5 hours ago by mosvy
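
Where gawk is not available, a similar result can probably be had from any POSIX awk by matching the first column directly instead of changing FS. The sketch below is an illustration under stated assumptions, not part of the answer above: first_field is a made-up name, and the pattern consumes either a backslash plus the character after it, or a single character that is neither a blank nor a backslash.

```shell
#!/bin/sh
# first_field: hypothetical helper that prints the first column of each line,
# treating "\ " (backslash followed by a space) as part of the column.
first_field() {
    awk '{
        # Leading run of "backslash + any character" or
        # "one character that is not a space, tab or backslash".
        if (match($0, /^(\\.|[^ \t\\])*/))
            print substr($0, RSTART, RLENGTH)
    }' "$@"
}

printf '%s\n' 'a\ b c' 'launch-wizard-1 0.0.0.0/0' | first_field
```

Unlike the \> approach, this does not depend on the column ending in a word character, though a lone backslash at the very end of a line is dropped by this pattern.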

3 votes

You could substitute space with something else and back again afterwards.

sed 's/\\ /\\x20/g' data_file | awk '{ print $1; }' | sed 's/\\x20/\\ /g'

answered 6 hours ago by RoVo

Only with sed: sed 's/\\ /\\x20/g;s/ .*//;s/\\x20/\\ /g' data_file
– ctac_
5 hours ago

Or, awk, using the default SUBSEP variable value of \034: awk '{gsub(/\\ /,SUBSEP,$0); val=$1; gsub(SUBSEP,"\\ ",val); print val}' file
– glenn jackman
3 hours ago
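
The placeholder idea from this answer and its comments can be folded into one awk process. The following is only a sketch: first_col_placeholder is a hypothetical name, and it assumes the input never contains the \034 control character that SUBSEP holds by default. The restore step rebuilds the text with split() rather than gsub(), which sidesteps the backslash rules for gsub() replacement strings that, as far as I know, have varied between awk implementations.

```shell
#!/bin/sh
# first_col_placeholder: hypothetical helper; assumes no \034 byte in the input.
first_col_placeholder() {
    awk '{
        gsub(/\\ /, SUBSEP)           # hide each "\ " behind SUBSEP; $0 is re-split afterwards
        n = split($1, part, SUBSEP)   # break the first column at the placeholders
        out = part[1]
        for (i = 2; i <= n; i++)
            out = out "\\ " part[i]   # "\\ " is a literal backslash followed by a space
        print out
    }' "$@"
}

printf '%s\n' 'ip\ 5.6.7.8 10.0.0.0/8' | first_col_placeholder
```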

3 votes

With just sed:

sed -r 's/^((([^\]*\\ ){1,})?[^ ]*).*/\1/' infile

Or shorter:

sed -r 's/^(([^\]*\\ )*[^ ]*).*/\1/' infile

This (([^\]*\\ ){1,})?[^ ]* matches:

[^\]*\\ : anything that is not a backslash, ending with a backslash followed by a space (note that \ does not need to be escaped inside a character class, but it does outside one).

([^\]*\\ ){1,}: matches the above one or more times.

(([^\]*\\ ){1,})?: the (...)? makes this part optional; we could use ([^\]*\\ ){0,} instead, or ([^\]*\\ )*.

((([^\]*\\ ){1,})?[^ ]*): matches the optional part above followed by anything that is not a space, and captures it as a group with \1 as its back-reference.

((([^\]*\\ ){1,})?[^ ]*).*: matches the above (...) and anything else (.*).

The replacement part then just prints \1, which gives the output:

5678
testing,\ group
[testing
ip\ 5.6.7.8
launch-wizard-1
456dlkjfa
1.2.3.4
test
default
launch-wizard-2
launch-wizard-3
2.3.4.5/32

edited 6 hours ago

answered 6 hours ago by αғsнιη
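
One caveat, offered tentatively: [^\]* also matches unescaped spaces, so on a line where the escaped space sits in a later column (no such line occurs in the sample data) the expressions above appear to keep everything up to that column. A possible stricter variant is sketched below under the hypothetical name first_col_sed. It uses -E (accepted by GNU and BSD sed) in place of -r, and consumes one escaped pair or one ordinary character at a time.

```shell
#!/bin/sh
# first_col_sed: hypothetical helper, not the answer's original expression.
first_col_sed() {
    # (\\.|[^\ ])* : repeat "backslash plus any character" or "one character
    # that is neither a backslash nor a space"; .* then swallows the rest.
    sed -E 's/^((\\.|[^\ ])*).*/\1/' "$@"
}

printf '%s\n' 'testing,\ group 1.2.3.4' 'foo bar\ baz' | first_col_sed
```

The second input line is the case described above: the stricter pattern stops at the first unescaped space and yields foo.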

1 vote

With GNU grep or compatible:

grep -Po '^(\\.|\S)*'

Or with ERE:

grep -Eo '^(\\.|[^[:space:]])*'

answered 3 mins ago by Stéphane Chazelas
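
Neither command above names an input file, so both read standard input. As a rough illustration of how the ERE form behaves, the sketch below wraps it in a hypothetical first_col_grep function and feeds it two sample rows. Each repetition consumes either a backslash together with the character after it or a single non-whitespace character, and -o prints only that matched prefix.

```shell
#!/bin/sh
# first_col_grep: hypothetical wrapper around the ERE variant; assumes a grep
# that supports -o (GNU grep and compatible implementations).
first_col_grep() {
    grep -Eo '^(\\.|[^[:space:]])*' "$@"
}

printf '%s\n' 'ip\ 5.6.7.8 10.0.0.0/8' '2.3.4.5/32' | first_col_grep
```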
