Ordering of String
I don't find these results coherent:
OrderedQ[{"a","A"}]
True
OrderedQ[{"a2","A1"}]
False
Is there an explanation of this somewhere? (Not necessarily specific to Mathematica; maybe there are standards or established conventions about this.)
One could think that the explanation is that "a" and "A" are equivalent: since OrderedQ[{"A","A"}] returns True, the first result would be normal. But in that case OrderedQ[{"A","a"}] shouldn't return False.
string-manipulation sorting
OrderedQ >> Details: "By default, OrderedQ uses canonical order as described in the notes for Sort." and Sort >> Details: "Sort orders strings as in a dictionary, with uppercase versions of letters coming after lowercase ones."
– kglr
3 hours ago
@kglr This behaviour is quite puzzling to me because the ordering is clearly not lexicographic. By "lexicographic" I mean an ordering of strings of tokens that is based on an ordering of the tokens themselves. I.e., in pseudocode notation, string1[1] > string2[1] implies that string1 > string2. This is clearly not the case here. Mathematica uses some more complex and more confusing ordering.
– Szabolcs
38 mins ago
Also consider OrderedQ[{{"a", "2"}, {"A", "1"}}] --> True. If upper/lowercase is treated specially, does that mean that: (1) ordering is language dependent (consider Turkish dotted uppercase İ and dotless lowercase ı)? If yes, what language does M use? (2) ordering is not well-defined for certain scripts?
– Szabolcs
35 mins ago
Example: Sort[{"I2", "İ2", "i2", "ı2", "I1", "İ1", "i1", "ı1"}] --> {"İ1", "İ2", "i1", "ı1", "I1", "i2", "ı2", "I2"}. My point is that the note in the documentation does not give an unambiguous description of what is going on.
– Szabolcs
31 mins ago
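For contrast, a plain token-by-token comparison of the kind described in the comments above can be emulated by comparing lists of character codes. This is only an illustrative sketch, not how OrderedQ works internally: codePointOrderedQ is a hypothetical helper name, and the sketch assumes strings of equal length, because canonical order places shorter lists before longer ones regardless of their contents.

```mathematica
(* Illustrative sketch: token-wise lexicographic order via character codes.
   codePointOrderedQ is a hypothetical helper, not a built-in function.
   Only meaningful for strings of equal length. *)
codePointOrderedQ[s1_String, s2_String] :=
  OrderedQ[{ToCharacterCode[s1], ToCharacterCode[s2]}]

codePointOrderedQ["a", "A"]    (* compares {97} with {65}: False *)
codePointOrderedQ["a2", "A1"]  (* compares {97, 50} with {65, 49}: False *)
codePointOrderedQ["A1", "a2"]  (* compares {65, 49} with {97, 50}: True *)
```

Under this ordering "A" always precedes "a", so the result for "a" and "A" is the opposite of what OrderedQ reports in the question, while the result for "a2" and "A1" happens to agree.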
edited 1 hour ago
asked 3 hours ago
andre
1 Answer
I think you have a good question. It seems that Sort treats "a" and "A" equivalently, and then sorts the elements that are equivalent. Here is an example that perhaps clarifies the issue:
Sort[{
"a2a1","A1a1","A2a1","a1a1","a2A1","A1A1","A2A1","a1A1",
"a2a2","A1a2","A2a2","a1a2","a2A2","A1A2","A2A2","a1A2"
}]
{"a1a1", "a1A1", "A1a1", "A1A1", "a1a2", "a1A2", "A1a2", "A1A2", "a2a1",
"a2A1", "A2a1", "A2A1", "a2a2", "a2A2", "A2a2", "A2A2"}
Notice how the x1x1 terms come first, then the x1x2 terms, etc. Within each grouping, the equivalent letters sort as "aa", "aA", "Aa", "AA", as expected.
answered 1 hour ago
Carl Woll
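The behaviour described in the answer above can be modelled as a two-level comparison: strings are first compared with case ignored, and only strings that tie at that level are separated by case, with lowercase before uppercase at the first position where they differ. The sketch below is a model built on that assumption, not Mathematica's documented algorithm. The names caseFlags, twoLevelKey, modelSort and modelOrderedQ are hypothetical, and the model is only intended for plain ASCII letters and digits.

```mathematica
(* Hypothetical two-level model of the string ordering; ASCII only. *)

(* Secondary key: 0 for every character that is not an uppercase letter,
   1 for an uppercase letter. *)
caseFlags[s_String] := Boole[UpperCaseQ[#]] & /@ Characters[s]

(* Primary key: the case-folded string. Secondary key: the case flags. *)
twoLevelKey[s_String] := {ToLowerCase[s], caseFlags[s]}

(* Sort by the two-level key; a list is "ordered" if sorting leaves it unchanged. *)
modelSort[list_List] := SortBy[list, twoLevelKey]
modelOrderedQ[list_List] := modelSort[list] === list

modelOrderedQ[{"a", "A"}]    (* keys tie on "a"; flags {0} before {1}: True *)
modelOrderedQ[{"a2", "A1"}]  (* primary keys "a2" and "a1" are out of order: False *)
modelOrderedQ[{"A", "a"}]    (* keys tie on "a"; flags {1} after {0}: False *)
```

These three results match the ones reported in the question. Evaluating modelSort[list] === Sort[list] on the sixteen-string list from the answer is one way to check whether the model also agrees with Sort there.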