Sorting values and grepping the best score (highest number)
I have a file that looks like this:
7 C00000002 score: -41.156 nathvy = 49 nconfs = 2251
8 C00000002 score: -39.520 nathvy = 49 nconfs = 3129
9 C00000004 score: -38.928 nathvy = 24 nconfs = 150
10 C00000002 score: -38.454 nathvy = 49 nconfs = 9473
11 C00000004 score: -37.704 nathvy = 24 nconfs = 156
12 C00000001 score: -37.558 nathvy = 41 nconfs = 51
2 C00000002 score: -48.649 nathvy = 49 nconfs = 3878
3 C00000001 score: -44.988 nathvy = 41 nconfs = 1988
4 C00000002 score: -42.674 nathvy = 49 nconfs = 6740
5 C00000002 score: -42.453 nathvy = 49 nconfs = 4553
6 C00000002 score: -41.829 nathvy = 49 nconfs = 7559
The second column contains IDs that are not sorted, and some of them repeat (C00000001, for example). Each line has a different number after score: (the number most often starts with -).
What I would like to do is:
1) Read the second column (the unsorted IDs) and always pick the first line in which each ID appears. In the case of C00000001, that would be the line with score: -37.558.
2) Once I have one line per unique ID, sort those lines by the number after score:, with the most negative number first and the most positive one last.
I would like the output printed in the same format as my input file (same structure).
command-line bash grep sort
edited Sep 4 at 6:03
Ravexina
asked Sep 4 at 5:37
djordje
The first score that appears for C00000001 is -37.558. Or is the order defined by the first column?
– Melebius
Sep 4 at 5:45
Oh, thanks Melebius, my fault, I will edit it now. I wrote the number with the highest score for this particular ID. So in the first step we don't look at the score; we just pick the first line that appears for each unique ID, and then order those lines by the number after score:, from most negative to most positive.
– djordje
Sep 4 at 5:49
3 Answers
accepted
$ sort -k2,2 -u < filename | sort -k4,4n
7 C00000002 score: -41.156 nathvy = 49 nconfs = 2251
9 C00000004 score: -38.928 nathvy = 24 nconfs = 150
12 C00000001 score: -37.558 nathvy = 41 nconfs = 51
Explanation:
sort -k2,2 -u: sorts the lines on the second column only and keeps one line per ID. Lines with the same ID compare as equal, so GNU sort does not reorder them and keeps the first one from the input.
sort -k4,4n: sorts numerically on the scores in the fourth column (there is no need for -r to reverse it, because the most negative score should come first).
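As a side note that is not part of the answer above, the same two steps can be sketched with awk doing the de-duplication, so that "first line per ID" holds by construction instead of depending on how sort -u treats equal keys. The function name first_per_id_by_score is made up for illustration, and LC_ALL=C is an added assumption so that "." is read as the decimal point.

```shell
# Keep the first line seen for each ID (field 2), then sort by score (field 4).
# Reads the files given as arguments, or standard input if there are none.
first_per_id_by_score() {
    # seen[$2]++ is 0 (false) the first time an ID appears, so !seen[$2]++
    # is true only for that first line, and awk prints it.
    awk '!seen[$2]++' "$@" |
        LC_ALL=C sort -k4,4n   # ascending numeric sort: most negative first
}

# Usage with a few of the question's lines supplied on standard input:
first_per_id_by_score <<'EOF'
7 C00000002 score: -41.156 nathvy = 49 nconfs = 2251
9 C00000004 score: -38.928 nathvy = 24 nconfs = 150
12 C00000001 score: -37.558 nathvy = 41 nconfs = 51
2 C00000002 score: -48.649 nathvy = 49 nconfs = 3878
3 C00000001 score: -44.988 nathvy = 41 nconfs = 1988
EOF
```

With a real file, the call would be first_per_id_by_score filename.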
edited Sep 4 at 13:43
answered Sep 4 at 5:58
Ravexina
You should use angle brackets for the filename: <filename>. At first I thought it was a sorting option. See docopt.org, for example.
– Melebius
Sep 4 at 6:11
Sure, I'll try to keep it in mind ;). But have you seen this?
– Ravexina
Sep 4 at 6:15
... or rather a variable reference like $filename, as angle brackets are confusing syntax in shell scripts.
– Grzegorz Oledzki
Sep 4 at 8:27
@Thor I saw your comment when you first posted it, but I was not able to get your suggestion to work in any form. However, I updated my command (yesterday) to sort -k4,4n, which is enough to get the highest value in this situation.
– Ravexina
Sep 5 at 7:28
With GNU awk 4.0 or later:
$ gawk '
!seen[$2] {seen[$2] = $0}
END {PROCINFO["sorted_in"] = "@val_num_asc"; for (i in seen) print seen[i]}
' file
7 C00000002 score: -41.156 nathvy = 49 nconfs = 2251
9 C00000004 score: -38.928 nathvy = 24 nconfs = 150
12 C00000001 score: -37.558 nathvy = 41 nconfs = 51
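A possible variation, offered only as a sketch: "@val_num_asc" compares the stored values as numbers, and since each stored value above is a whole line, the number used is, as far as I can tell, the leading one (the first column), which happens to follow the score order in this sample. The version below keeps the score in its own array so the traversal is driven by field 4 explicitly. The function name and the line/score array names are assumptions for illustration, and it needs gawk 4.0 or later.

```shell
# First line per ID (field 2), printed in ascending order of score (field 4).
# Requires GNU awk for PROCINFO["sorted_in"].
first_per_id_by_score_gawk() {
    gawk '
        # The first time an ID appears, remember its whole line and its score.
        !($2 in line) { line[$2] = $0; score[$2] = $4 }
        END {
            # Visit the score array in ascending numeric order of its values.
            PROCINFO["sorted_in"] = "@val_num_asc"
            for (id in score) print line[id]
        }
    ' "$@"
}
```

Called as first_per_id_by_score_gawk file on the question's input, this should print the same three lines as the output above.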
answered Sep 4 at 11:45
steeldriver
Contributing an additional one-line command that can easily be adapted:
for row in $(cat tmp | awk '{print $2}' | sort | uniq); do cat tmp | grep $row | head -n 1; done | sort -r --key=4
7 C00000002 score: -41.156 nathvy = 49 nconfs = 2251
9 C00000004 score: -38.928 nathvy = 24 nconfs = 150
12 C00000001 score: -37.558 nathvy = 41 nconfs = 51
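A stricter variant of the same loop, as a sketch rather than a replacement for the command above: grep $row matches the ID anywhere on the line (including as part of a longer ID), and sort -r --key=4 compares text from field 4 to the end of the line in reverse, which gives the right order here because the scores are negative and equally wide. The version below matches field 2 exactly and sorts field 4 numerically. The function name first_per_id_loop is hypothetical, and LC_ALL=C is an added assumption so that "." is the decimal point.

```shell
# For each distinct ID in field 2 of the given file, print the first line
# carrying exactly that ID, then sort the selected lines by score (field 4).
first_per_id_loop() {
    file=$1
    for id in $(awk '{print $2}' "$file" | sort -u); do
        # Print the first line whose second field equals this ID, then stop.
        awk -v id="$id" '$2 == id { print; exit }' "$file"
    done | LC_ALL=C sort -k4,4n
}
```

It would be called as first_per_id_loop tmp, with tmp being the input file name used in the answer above.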
answered Sep 4 at 12:34
user2832190