Ordering of String

I don't find these results coherent:

OrderedQ[{"a", "A"}]

True

OrderedQ[{"a2", "A1"}]

False

Is there an explanation of this somewhere? (It need not be specific to Mathematica; maybe there are standards or established conventions about this.)

One could think that the explanation is:

  • "a" and "A" are equivalent
  • OrderedQ[{"A", "A"}] returns True

so the first result is normal.

But in that case OrderedQ[{"A", "a"}] shouldn't return False.

  • 3

    OrderedQ >> Details: "By default, OrderedQ uses canonical order as described in the notes for Sort." and Sort >> Details: "Sort orders strings as in a dictionary, with uppercase versions of letters coming after lowercase ones."
    – kglr
    3 hours ago

  • 1

    @kglr This behaviour is quite puzzling to me because the ordering is clearly not lexicographic. By "lexicographic" I mean an ordering of strings of tokens that is based on an ordering of the tokens themselves, i.e., in pseudocode notation, string1[1] > string2[1] implies that string1 > string2. This is clearly not the case here. Mathematica uses some more complex and more confusing ordering.
    – Szabolcs
    38 mins ago

  • 1

    Also consider OrderedQ[{{"a", "2"}, {"A", "1"}}] --> True. If upper/lowercase is treated specially, does that mean that: (1) ordering is language dependent (consider Turkish dotted uppercase İ and dotless lowercase ı)? If yes, what language does M use? (2) ordering is not well-defined for certain scripts?
    – Szabolcs
    35 mins ago

  • Example: Sort[{"I2", "İ2", "i2", "ı2", "I1", "İ1", "i1", "ı1"}] --> {"İ1", "İ2", "i1", "ı1", "I1", "i2", "ı2", "I2"}. My point is that the note in the documentation does not give an unambiguous description of what is going on.
    – Szabolcs
    31 mins ago
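
The definition of "lexicographic" given in the comments above can be probed with a short sketch. It assumes the list form OrderedQ[{s1, s2}]; firstDifference and lexicographicConsistentQ are hypothetical helper names, not built-ins. In a lexicographic order, the first position at which two strings differ decides the comparison, so the verdict for the whole strings should match the verdict for that pair of characters.

```mathematica
(* Hypothetical helper: the first pair of differing characters, or None
   if one string is a prefix of the other (or they are identical). *)
firstDifference[s1_String, s2_String] :=
 With[{n = Min[StringLength[s1], StringLength[s2]]},
  SelectFirst[
   MapThread[List, {Characters[StringTake[s1, n]], Characters[StringTake[s2, n]]}],
   First[#] =!= Last[#] &,
   None]]

(* Hypothetical check: does the order of the whole strings agree with the
   order of their first differing characters? *)
lexicographicConsistentQ[s1_String, s2_String] :=
 With[{d = firstDifference[s1, s2]},
  d === None || OrderedQ[{s1, s2}] === OrderedQ[d]]

firstDifference["a2", "A1"]   (* {"a", "A"} *)
lexicographicConsistentQ["a2", "A1"]
```

Given the results reported in the question (OrderedQ[{"a", "A"}] is True while OrderedQ[{"a2", "A1"}] is False), the last call would be expected to return False, which is the inconsistency the comment points out.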














string-manipulation sorting






edited 1 hour ago

asked 3 hours ago

andre

1 Answer
I think you have a good question. It seems that Sort first treats "a" and "A" as equivalent, and then orders the elements that compare as equivalent. Here is an example that perhaps clarifies the issue:

Sort[{
  "a2a1", "A1a1", "A2a1", "a1a1", "a2A1", "A1A1", "A2A1", "a1A1",
  "a2a2", "A1a2", "A2a2", "a1a2", "a2A2", "A1A2", "A2A2", "a1A2"
}]

{"a1a1", "a1A1", "A1a1", "A1A1", "a1a2", "a1A2", "A1a2", "A1A2", "a2a1",
 "a2A1", "A2a1", "A2A1", "a2a2", "a2A2", "A2a2", "A2A2"}

Notice how the x1x1 terms come first, then the x1x2 terms, etc. Within each group, the case variants are ordered as "aa", "aA", "Aa", "AA", as expected. Under this reading, "a" and "A" tie on the first pass and the lowercase form is placed first, whereas "a2" and "A1" already differ on the first pass, since "1" sorts before "2", so case is never consulted.
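
The two-pass reading above can be written out as an explicit comparison. This is a sketch of the hypothesis, not Mathematica's actual implementation: lexOrder and twoLevelOrder are hypothetical names, characters are compared by character code (an assumption that only approximates a real dictionary order), and the sign convention follows Order (1 for in order, 0 for a tie, -1 for out of order).

```mathematica
(* Hypothetical helper: lexicographic comparison of two integer lists.
   Returns 1 if a comes first, -1 if b comes first, 0 if they are identical. *)
lexOrder[a_List, b_List] :=
 Module[{n = Min[Length[a], Length[b]], d},
  d = SelectFirst[Sign[Take[b, n] - Take[a, n]], # != 0 &, 0];
  If[d != 0, d, Sign[Length[b] - Length[a]]]]

(* First pass ignores case; the second pass runs only on a tie and puts
   lowercase (0) before uppercase (1) at the first position that differs. *)
twoLevelOrder[s1_String, s2_String] :=
 Module[{primary},
  primary = lexOrder[
    ToCharacterCode[ToLowerCase[s1]],
    ToCharacterCode[ToLowerCase[s2]]];
  If[primary != 0,
   primary,
   lexOrder[
    Boole[UpperCaseQ /@ Characters[s1]],
    Boole[UpperCaseQ /@ Characters[s2]]]]]

twoLevelOrder["a", "A"]    (* 1: tie on the first pass, lowercase first *)
twoLevelOrder["A", "a"]    (* -1 *)
twoLevelOrder["a2", "A1"]  (* -1: "1" precedes "2" on the first pass *)
```

These three values line up with the True, False, False results reported in the question. Sorting with Sort[list, twoLevelOrder[#1, #2] >= 0 &] should likewise reproduce the order of the sixteen strings shown above, though agreement on these examples does not show that Sort works this way for other scripts or characters.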






answered 1 hour ago

Carl Woll