Filtering records of a file based on a value of a column

I would like to extract records of a file based on a condition. The file contains 47 fields, and I would like to extract only those records that match a certain value in one of the columns, e.g.

1 XX 45 N
2 YY 34 y
3 ZZ 44 N
4 XX 89 Y
5 XX 45 N
6 YY 84 D
7 ZZ 22 S
8 ZX 12 6
9 ZA 32 8

From this file, I would like to extract all records whose second column is XX or YY; ZZ and all other records should be excluded (as shown below).

1 XX 45 N
2 YY 34 y
4 XX 89 Y
5 XX 45 N
6 YY 84 D

Does anybody know how to do this using sed or awk or any other UNIX tool? Thank you.

asked 4 hours ago by Shervan (edited 4 hours ago)

1 Answer

accepted

Let's say your data is in a file called file.txt. You can try awk as follows; the anchored pattern keeps only records whose second field is exactly XX or YY:

awk '$2 ~ /^(XX|YY)$/' file.txt
1 XX 45 N
2 YY 34 y
4 XX 89 Y
5 XX 45 N
6 YY 84 D

or grep (the -E option is needed so that | is treated as alternation):

grep -E 'XX|YY' file.txt
1 XX 45 N
2 YY 34 y
4 XX 89 Y
5 XX 45 N
6 YY 84 D

To sort the data based on the third column, you can pipe the output to sort as follows:

grep -E 'XX|YY' file.txt | sort -k3,3n
2 YY 34 y
1 XX 45 N
5 XX 45 N
6 YY 84 D
4 XX 89 Y
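
As a variant of the commands above, here is a minimal sketch that filters on the second field with plain string comparison and then sorts on the third field only. The temporary file and the variable names `sample` and `filtered` are assumptions made for illustration; the data is the sample from the question.

```shell
#!/bin/sh
# Recreate the sample data from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
1 XX 45 N
2 YY 34 y
3 ZZ 44 N
4 XX 89 Y
5 XX 45 N
6 YY 84 D
7 ZZ 22 S
8 ZX 12 6
9 ZA 32 8
EOF

# Keep records whose second field is exactly XX or YY,
# then sort numerically using the third field as the only key.
filtered=$(awk '$2 == "XX" || $2 == "YY"' "$sample" | sort -k3,3n)

printf '%s\n' "$filtered"
rm -f "$sample"
```

String equality avoids any regular-expression surprises, at the cost of listing each accepted value explicitly.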

edited 3 hours ago by Stephen Kitt

answered 4 hours ago by Goro

Thank you. I have one question, please: how can I organize the new list based on the numbers in the third column?
– Shervan
4 hours ago

Please see my edits.
– Goro
4 hours ago

Thank you so much @goro for the great help!
– Shervan
4 hours ago

@Shervan note that the grep variant matches the values anywhere in the line, not just in the second field.
– Stephen Kitt
4 hours ago
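
To illustrate the comment above, here is a small sketch comparing the two approaches. The record `10 ZA 77 XX` is a made-up line (not from the question) that carries XX in the fourth field rather than the second; the variable names are likewise hypothetical.

```shell
#!/bin/sh
# Small sample; the last record is hypothetical and has XX in field 4.
sample=$(mktemp)
cat > "$sample" <<'EOF'
1 XX 45 N
3 ZZ 44 N
10 ZA 77 XX
EOF

# grep looks at the whole line, so the hypothetical record matches too.
grep_hits=$(grep -E 'XX|YY' "$sample")

# awk tests only the second field, so that record is skipped.
awk_hits=$(awk '$2 == "XX" || $2 == "YY"' "$sample")

printf 'grep:\n%s\n\nawk:\n%s\n' "$grep_hits" "$awk_hits"
rm -f "$sample"
```

Here grep reports two records while awk reports only `1 XX 45 N`.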

@Goro I was only wondering why you specified the starting character (.1), that's all. Note that your sort applies to the rest of the line, starting with the third field; I think -k3,3n would be more appropriate.
– Stephen Kitt
3 hours ago
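
The point about the key's extent can be seen in a small sketch. The two records below are invented for illustration, and the numeric flag is left out on purpose so that the key text is compared as a string and the effect becomes visible. `LC_ALL=C` is set only to make the ordering predictable.

```shell
#!/bin/sh
# Two hypothetical records with the same third field but different fourth fields.
sample=$(mktemp)
cat > "$sample" <<'EOF'
a x 45 Z
b x 45 A
EOF

# -k3: the key runs from field 3 to the end of the line,
# so the fourth field takes part in the comparison.
open_key=$(LC_ALL=C sort -k3 "$sample")

# -k3,3: the key is field 3 only; equal keys fall back to
# comparing the whole line.
bounded_key=$(LC_ALL=C sort -k3,3 "$sample")

printf -- '-k3:\n%s\n\n-k3,3:\n%s\n' "$open_key" "$bounded_key"
rm -f "$sample"
```

With `-k3` the record `b x 45 A` comes first because `45 A` sorts before `45 Z`; with `-k3,3` the keys tie and `a x 45 Z` comes first.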
