Ordering a string by the count of substrings?

4 votes

I have a long list of numbers like this:

1234-212-22-11153782-0114232192380
8807698823332-6756-234-14-09867378
45323-14-221-238372635363-43676256
62736373-9983-23-234-8863345637388
. . . .
. . . .

I would like to do two things:

1) Reorder the segments of each line by the number of digits in each segment. The output should look like this:

22-212-1234-11153782-0114232192380
14-234-6756-09867378-8807698823332
14-221-45323-43676256-238372635363
23-234-9983-62736373-8863345637388

2) Find the number of digits in each segment of each line. The output should be:

2-3-4-8-13
2-3-4-8-13
2-3-5-8-12
2-3-4-8-13

In this example the first, second and third segments of the output have nearly the same lengths on every line, but they could be different.

  • Are fields of identical length possible? How should these be arranged?
    – RudiC
    1 hour ago

  • This looks like homework. Are you allowed to use something besides bash, e.g. Python?
    – Hermann
    1 hour ago

  • You mention Linux; can we assume a GNU/Linux environment for solutions?
    – Jeff Schaller
    1 hour ago

  • Not sure why this would be homework. Could be log output from a game or a piece of custom hardware or anything really.
    – pipe
    10 mins ago

  • @marco. Super nice question!! :-)
    – Goro
    3 mins ago

Tags: text-processing, sort

asked 1 hour ago by marco; edited 4 mins ago by Goro

3 Answers

6 votes

How about

$ perl -F'-' -lane 'print join "-", sort { length $a <=> length $b } @F' file
22-212-1234-11153782-0114232192380
14-234-6756-09867378-8807698823332
14-221-45323-43676256-238372635363
23-234-9983-62736373-8863345637388

and

$ perl -F'-' -lane 'print join "-", map { length $_ } sort { length $a <=> length $b } @F' file
2-3-4-8-13
2-3-4-8-13
2-3-5-8-12
2-3-4-8-13
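
For comparison, the same sort-by-length idea can be sketched without perl as a decorate-sort-undecorate pipeline. This is only a sketch under assumptions: `sort_segments` is a made-up helper name, and the `-s` (stable) option is assumed to be available, as it is in GNU and BSD sort, although POSIX does not require it.

```shell
# Hypothetical helper: reorder the "-"-separated segments of every line on
# stdin by their length, shortest first.
sort_segments() {
    while IFS= read -r line; do
        printf '%s\n' "$line" |
            tr '-' '\n' |                      # one segment per line
            awk '{ print length($0), $0 }' |   # decorate: "<length> <segment>"
            sort -s -n -k1,1 |                 # -s is an assumption (GNU/BSD sort)
            cut -d' ' -f2- |                   # undecorate: drop the length key
            paste -sd- -                       # join the segments with "-" again
    done
}
```

Something like `sort_segments < file` would then print the reordered lines. Because it starts several processes per input line, it will likely be much slower than the perl one-liner on a long list.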





4 votes

GNU awk can sort, so the trickiest part is deciding how to separate the two desired outputs. This script generates both results; you can decide whether you'd like them somewhere other than the hard-coded output files:

# order array elements by the length of their values
function compare_length(i1, v1, i2, v2) {
    return (length(v1) - length(v2));
}

BEGIN {
    # make "for (... in ...)" traverse arrays in compare_length order
    PROCINFO["sorted_in"]="compare_length"
    FS="-"
}

{
    split($0, elements);
    asort(elements, sorted_elements, "compare_length");
    reordered="";
    lengths="";
    for (element in sorted_elements) {
        reordered=(reordered == "" ? "" : reordered FS) sorted_elements[element];
        lengths=(lengths == "" ? "" : lengths FS) length(sorted_elements[element]);
    }
    print reordered > "reordered.out";
    print lengths > "lengths.out";
}
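
asort() with a comparison function and PROCINFO["sorted_in"] are gawk extensions, so the script above is unlikely to run under other awk implementations. If GNU awk is not available, something along these lines might do instead. It is a sketch that sticks to POSIX awk features; `sort_and_count` is a hypothetical name, and it prints both results on one tab-separated line instead of writing files. The insertion sort keeps segments of equal length in their input order, which is one possible answer to the question about identical lengths raised in the comments.

```shell
# Hypothetical helper: for each line on stdin, print the segments reordered
# by length, then a tab, then the lengths themselves.
sort_and_count() {
    awk '
    {
        n = split($0, seg, "-")
        # insertion sort by length; the strict ">" keeps ties in input order
        for (i = 2; i <= n; i++) {
            cur = seg[i]
            for (j = i - 1; j >= 1 && length(seg[j]) > length(cur); j--)
                seg[j + 1] = seg[j]
            seg[j + 1] = cur
        }
        reordered = lengths = ""
        for (i = 1; i <= n; i++) {
            sep = (i > 1 ? "-" : "")
            reordered = reordered sep seg[i]
            lengths = lengths sep length(seg[i])
        }
        print reordered "\t" lengths
    }'
}
```

It could be called as `sort_and_count < file`.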






2 votes

How far would this get you:

awk -F- '                        # set "-" as the field separator
{
    for (i=1; i<=NF; i++) {
        L = length($i)           # for every single field, calc its length
        T[L] = $i                # and populate the T array with length as index
                                 # (fields of equal length overwrite each other)
        if (L>MX) MX = L         # keep max length
    }
    $0 = ""                      # empty line
    for (i=1; i<=MX; i++)
        if (i in T) {            # (i in T) also keeps all-zero fields such as "00"
            $0 = $0 OFS T[i]     # append each existing T element to the line, separated by "-"
            C = C OFS i          # keep the field lengths in separate variable C
        }
    print substr ($0, 2) "\t" substr (C, 2)   # print the line and the field lengths, eliminating each first char
    C = MX = ""                  # reset working variables
    split ("", T)                # delete T array
}
' OFS=- file
22-212-1234-11153782-0114232192380 2-3-4-8-13
14-234-6756-09867378-8807698823332 2-3-4-8-13
14-221-45323-43676256-238372635363 2-3-5-8-12
23-234-9983-62736373-8863345637388 2-3-4-8-13

You may want to split the printout into two result files.
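
Splitting the combined printout into two result files could look like the sketch below. It assumes the two columns are separated by a single tab; `split_results` and its two file-name arguments are hypothetical, and the file names are assumed not to contain backslashes, since awk processes escape sequences in `-v` values.

```shell
# Hypothetical helper: read "reordered<TAB>lengths" lines on stdin and write
# the first column to the file named by $1 and the second to the file named by $2.
split_results() {
    awk -F'\t' -v reordered="$1" -v lengths="$2" '
    {
        print $1 > reordered
        print $2 > lengths
    }'
}
```

The awk command above could then be piped into it, for example `... | split_results reordered.out lengths.out`.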






Perl answer: steeldriver, answered 37 mins ago

GNU awk answer: Jeff Schaller, answered 33 mins ago

awk answer (T array indexed by length): RudiC, answered 51 mins ago; edited 22 mins ago by terdon♦
