OrderedQ with string arguments
I don't find these results consistent:
OrderedQ[{"a", "A"}]
True
OrderedQ[{"a2", "A1"}]
False
Is there an explanation of this somewhere? (It is not necessarily specific to Mathematica; maybe there are standards or established conventions about this.)
One could think that the explanation is: since "a" and "A" are equivalent, and since OrderedQ[{"A", "A"}] returns True, this is normal. But in that case OrderedQ[{"A", "a"}] shouldn't return False.
EDIT
(Thanks to @Michael E2's comments)
It turns out that this question has nothing to do with the fact that "1" and "2" are digit characters. The same thing happens if one replaces "1" by "c" and "2" by "d", for example.
string-manipulation sorting
edited 10 mins ago by m_goldberg
asked 9 hours ago by andre
3
OrderedQ >> Details: "By default, OrderedQ uses canonical order as described in the notes for Sort." and Sort >> Details: "Sort orders strings as in a dictionary, with uppercase versions of letters coming after lowercase ones."
– kglr
9 hours ago
2
@kglr This behaviour is quite puzzling to me because the ordering is clearly not lexicographic. By "lexicographic" I mean an ordering of strings of tokens that is based on an ordering of the tokens themselves, i.e., in pseudocode notation, string1[1] > string2[1] implies that string1 > string2. This is clearly not the case here. Mathematica uses some more complex and more confusing ordering.
– Szabolcs
7 hours ago
1
Also consider OrderedQ[{{"a", "2"}, {"A", "1"}}] --> True. If upper/lowercase is treated specially, does that mean that: (1) ordering is language dependent (consider Turkish dotted uppercase İ and dotless lowercase ı)? If yes, what language does M use? (2) ordering is not well-defined for certain scripts?
– Szabolcs
7 hours ago
Example: Sort[{"I2", "İ2", "i2", "ı2", "I1", "İ1", "i1", "ı1"}] --> {"İ1", "İ2", "i1", "ı1", "I1", "i2", "ı2", "I2"}. My point is that the note in the documentation does not give an unambiguous description of what is going on.
– Szabolcs
7 hours ago
@Szabolcs, all good points.
– kglr
4 hours ago
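The results quoted in the question are consistent with a two-pass comparison: compare the strings with case ignored, and let lowercase win only when that first pass ties. The Python sketch below is a model of that reading, not Mathematica's actual implementation. The names `dictionary_key` and `ordered_q` are hypothetical, and only ASCII letters and digits are considered (it says nothing about punctuation or the Turkish characters raised in the comments).

```python
def dictionary_key(s: str) -> tuple:
    """Two-level key: case-folded text first, case as the tie-breaker.

    str.swapcase() turns lowercase letters into uppercase ones, whose
    code points are smaller, so lowercase variants sort first whenever
    the case-folded strings are equal.
    """
    return (s.lower(), s.swapcase())


def ordered_q(*strings: str) -> bool:
    """True if the strings are in non-decreasing two-level order."""
    keys = [dictionary_key(s) for s in strings]
    return all(a <= b for a, b in zip(keys, keys[1:]))


# Plain code-point comparison decides on the first differing character
# and places uppercase before lowercase:
print("a" < "A")               # False

# The two-level model reproduces the results from the question:
print(ordered_q("a", "A"))     # True
print(ordered_q("a2", "A1"))   # False
print(ordered_q("A", "a"))     # False
print(ordered_q("A", "A"))     # True
```

In this model "a2" versus "A1" is decided by "2" versus "1" in the first pass, so the case of the leading letter never comes into play. That is why the order is not lexicographic in the character-by-character sense used in the comments.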
1 Answer
I think you have a good question. It seems that Sort treats "a" and "A" equivalently, and then sorts the elements that are equivalent. That is, case appears to be ignored in a first pass and used only to order strings that are otherwise identical. Here is an example that perhaps clarifies the issue:
Sort[{
"a2a1","A1a1","A2a1","a1a1","a2A1","A1A1","A2A1","a1A1",
"a2a2","A1a2","A2a2","a1a2","a2A2","A1A2","A2A2","a1A2"
}]
{"a1a1", "a1A1", "A1a1", "A1A1", "a1a2", "a1A2", "A1a2", "A1A2", "a2a1",
"a2A1", "A2a1", "A2A1", "a2a2", "a2A2", "A2a2", "A2A2"}
Notice how the x1x1 terms come first, then the x1x2 terms, etc. Within each group, the equivalent letters sort as "aa", "aA", "Aa", "AA", as expected.
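For readers who want to experiment outside Mathematica, the grouping described above can be imitated in Python with a two-level sort key. This is only a sketch under the assumption that case acts as a tie-breaker after a case-insensitive comparison. The helper name `dictionary_key` is hypothetical, and the model covers ASCII letters and digits only.

```python
from itertools import groupby


def dictionary_key(s: str) -> tuple:
    # Level 1: the string with case ignored.
    # Level 2: the case-swapped string, so that lowercase letters
    # (now uppercase, with smaller code points) win ties.
    return (s.lower(), s.swapcase())


words = [
    "a2a1", "A1a1", "A2a1", "a1a1", "a2A1", "A1A1", "A2A1", "a1A1",
    "a2a2", "A1a2", "A2a2", "a1a2", "a2A2", "A1A2", "A2A2", "a1A2",
]

result = sorted(words, key=dictionary_key)
print(result)

# Group by the case-folded form to make the x1x1, x1x2, ... blocks visible.
for folded, group in groupby(result, key=str.lower):
    print(folded, list(group))
```

Under this model the sorted list matches the Sort output shown above, with each case-folded group ordered as "aa", "aA", "Aa", "AA".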
answered 7 hours ago by Carl Woll
I would add that "dictionary order" is easier to understand with examples like OrderedQ[{"ac", "Ab"}] instead of the ones and twos. Also, punctuation characters are treated differently (and differently from my print dictionary) than letter-like characters, to which class digits apparently belong.
– Michael E2
5 hours ago
I don't understand the plain-English statement "It seems that Sort treats 'a' and 'A' equivalently" -- in the most natural interpretation (to me), it does not agree with the example below. Perhaps you could try phrasing that in a different way?
– Mr.Wizard♦
1 hour ago