OrderedQ with string arguments

I don't find these results consistent:

OrderedQ[{"a", "A"}]

True

OrderedQ[{"a2", "A1"}]

False

Is there an explanation of this somewhere? (It is not necessarily specific to Mathematica; maybe there are standards or established conventions about this.)

One could think that the explanation is: since "a" and "A" are equivalent and OrderedQ[{"A", "A"}] returns True, it's normal. But in that case OrderedQ[{"A", "a"}] shouldn't return False.
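
A quick way to tell "equivalent" apart from "ordered" is to look at the three-way comparison next to the yes/no answer. The sketch below is only an illustration: the name `pairs` is made up for this example. It uses the built-in Order, which gives 1 if its first argument precedes the second in canonical order, -1 if it follows, and 0 only if the two are identical.

```wolfram
(* Illustrative input pairs; the name "pairs" is not from the original post. *)
pairs = {{"a", "A"}, {"A", "a"}, {"A", "A"}, {"a2", "A1"}};

(* For each pair, list the pair, OrderedQ's yes/no answer,
   and Order's three-way answer (1, 0 or -1). *)
Table[{p, OrderedQ[p], Order @@ p}, {p, pairs}]
```

If "a" and "A" were truly equivalent for sorting purposes, Order would have to return 0 for them. The True/False pair reported above for {"a", "A"} and {"A", "a"} suggests instead that the two strings are distinct and strictly ordered.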



EDIT

(Thanks to @Michael E2's comments)

It turns out that this question has nothing to do with the fact that "1" and "2" are digit characters. The same thing happens if one replaces "1" by "c" and "2" by "d", for example.

  • OrderedQ >> Details: "By default, OrderedQ uses canonical order as described in the notes for Sort." And Sort >> Details: "Sort orders strings as in a dictionary, with uppercase versions of letters coming after lowercase ones."
    – kglr
    9 hours ago

  • @kglr This behaviour is quite puzzling to me because the ordering is clearly not lexicographic. By "lexicographic" I mean an ordering of strings of tokens that is based on an ordering of the tokens themselves, i.e., in pseudocode notation, string1[1] > string2[1] implies that string1 > string2. This is clearly not the case here. Mathematica uses some more complex and more confusing ordering.
    – Szabolcs
    7 hours ago

  • Also consider OrderedQ[{{"a", "2"}, {"A", "1"}}] --> True. If upper/lowercase is treated specially, does that mean that (1) ordering is language dependent (consider Turkish dotted uppercase İ and dotless lowercase ı)? If yes, what language does M use? (2) Ordering is not well-defined for certain scripts?
    – Szabolcs
    7 hours ago

  • Example: Sort[{"I2", "İ2", "i2", "ı2", "I1", "İ1", "i1", "ı1"}] --> {"İ1", "İ2", "i1", "ı1", "I1", "i2", "ı2", "I2"}. My point is that the note in the documentation does not give an unambiguous description of what is going on.
    – Szabolcs
    7 hours ago

  • @Szabolcs, all good points.
    – kglr
    4 hours ago

string-manipulation sorting

edited 10 mins ago by m_goldberg

asked 9 hours ago by andre

1 Answer
I think you have a good question. It seems that Sort treats "a" and "A" equivalently, and then sorts the elements that are equivalent. In other words, strings appear to be compared first with case ignored, and case is used only to break ties between strings that are otherwise identical. Here is an example that perhaps clarifies the issue:

Sort[{
"a2a1","A1a1","A2a1","a1a1","a2A1","A1A1","A2A1","a1A1",
"a2a2","A1a2","A2a2","a1a2","a2A2","A1A2","A2A2","a1A2"
}]

{"a1a1", "a1A1", "A1a1", "A1A1", "a1a2", "a1A2", "A1a2", "A1A2", "a2a1",
"a2A1", "A2a1", "A2A1", "a2a2", "a2A2", "A2a2", "A2A2"}

Notice how the x1x1 terms (where x stands for either "a" or "A") come first, then the x1x2 terms, etc. Within each grouping, the equivalent letters sort as "aa", "aA", "Aa", "AA", as expected.
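
One way to read the grouping described above is as a two-part comparison: first the string with case ignored, then a tie-breaker that puts lowercase before uppercase, position by position. The sketch below tries that reading on the same sixteen strings. It is only a model of the observed output, not a statement of how Sort is implemented, and `caseKey`, `strings` and `modelOrder` are hypothetical names introduced for this illustration.

```wolfram
(* Hypothetical helper, not a built-in: a two-part sort key.
   Part 1 is the string with case removed.
   Part 2 marks each character with 1 if it is an uppercase letter and 0 otherwise,
   so that lowercase comes first whenever part 1 ties. *)
caseKey[s_String] := {ToLowerCase[s], Boole[UpperCaseQ /@ Characters[s]]}

strings = {
  "a2a1", "A1a1", "A2a1", "a1a1", "a2A1", "A1A1", "A2A1", "a1A1",
  "a2a2", "A1a2", "A2a2", "a1a2", "a2A2", "A1A2", "A2A2", "a1A2"};

(* Sort by the model key, then compare with the built-in ordering. *)
modelOrder = SortBy[strings, caseKey];
modelOrder === Sort[strings]
```

If the final comparison returns True, the model agrees with Sort on these inputs. That would not show that the same rule holds for punctuation or for non-ASCII letters such as those raised in the comments on the question.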






  • I would add that "dictionary order" is easier to understand with examples like OrderedQ[{"ac", "Ab"}] instead of the ones and twos. Also, punctuation characters are treated differently from letter-like characters (and differently from my print dictionary); digits apparently belong to the letter-like class.
    – Michael E2
    5 hours ago

  • I don't understand the plain-English statement "It seems that Sort treats 'a' and 'A' equivalently" -- it does not in the most natural interpretation (to me) agree with the example below. Perhaps you could try phrasing that in a different way?
    – Mr.Wizard♦
    1 hour ago

answered 7 hours ago by Carl Woll