/(.+)\n\1/ works but /(.*)\n\1/ doesn't when they should both work

I was playing around with sed after answering another question and I noticed that .+ and .* are not giving the same result when they both match multiple characters in a context address.

The following command [1]:

sed -E '$!N;/(.+)\n\1/!P;D' <<IN
one
one_more
two
two_more
IN


prints



one_more
two_more


OK, that's the expected output.

Changing the regex from .+ to .* (i.e. from one or more characters to zero or more characters) should give the same result, but it does not:

sed -E '$!N;/(.*)\n\1/!P;D' <<IN
one
one_more
two
two_more
IN


prints just one line



two_more


What's going on here?

[1]: I'm using ERE for simplicity/readability; the same happens when using BRE.

Tags: sed, regular-expression

asked 3 hours ago by don_crissti (edited 1 hour ago)


1 Answer

That happens because /(.*)\n\1/ also matches a bare newline (\n: the empty string, followed by a newline, followed by the same empty string again).

So it will also match the string one_more\ntwo from your example.

To avoid that, you'll have to anchor your regexps, e.g. sed -E '$!N;/^(.+)\n\1/!P;D' or sed -E '$!N;/^(.*)\n\1/!P;D'.
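
As a quick check of the anchored forms, here is a minimal sketch that feeds the four sample lines from the question to both scripts. The helper name run_sample is made up for illustration, and the sketch assumes a sed that accepts -E and back-references in ERE (GNU sed does).

```shell
# Hypothetical helper (not from the question): run a sed script
# over the four sample lines used in the question.
run_sample() {
  printf '%s\n' one one_more two two_more | sed -E "$1"
}

# Anchored variants suggested in the answer.
plus_out=$(run_sample '$!N;/^(.+)\n\1/!P;D')
star_out=$(run_sample '$!N;/^(.*)\n\1/!P;D')

printf 'anchored .+ prints:\n%s\n' "$plus_out"
printf 'anchored .* prints:\n%s\n' "$star_out"
```

With ^ in place, the group has to cover the whole first line of the pattern space, so on this sample both variables should hold one_more and two_more.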







answered 2 hours ago by mosvy (accepted answer; edited 2 hours ago by don_crissti)

• But wouldn't that contradict greediness? – RudiC, 2 hours ago
• No. It's simply that "one_more\ntwo" will match /(.*)\n\1/, whether greedy or not. To understand what happens in that sed script, add a debugging command after the N, e.g.: sed -E '$!N; h;s/(.; /(.*)\n\1/!P; D' ... – mosvy, 2 hours ago
• @RudiC - greedy means the regex engine attempts to match as much as possible; in this particular case, with (.*)\n\1, nothing (the empty string) followed by a newline and again by nothing matches, and that is sufficient. – don_crissti, 1 hour ago
• @RudiC one_more\ntwo doesn't match /(.+)\n\1/ at all (since there's no non-empty string that appears on both sides of the NL), but it does match /(.*)\n\1/. Then again, one_more\neno would match /(.+)\n\1/ too, so perhaps the anchors are a good idea. – ilkkachu, 1 hour ago
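
The claims in these comments can be checked one pattern space at a time. The sketch below is only an illustration: the function name matches is hypothetical, and the first command swaps in sed's l command as the debugging step rather than reproducing the command from the comment above. It assumes a sed with -E and ERE back-references (e.g. GNU sed), and that the regex passed in contains no / character.

```shell
# Show each pattern space right after N, using l (prints it unambiguously,
# with an embedded newline shown as \n).
printf '%s\n' one one_more two two_more |
  sed -E '$!N;l;/(.*)\n\1/!P;D'

# Hypothetical helper: join two lines with N, then report whether the
# given ERE matches the resulting two-line pattern space.
matches() {  # usage: matches REGEX LINE1 LINE2
  if printf '%s\n%s\n' "$2" "$3" | sed -En "N;/$1/p" | grep -q .; then
    echo yes
  else
    echo no
  fi
}

matches '(.*)\n\1'  one_more two       # empty string on both sides of the newline
matches '(.+)\n\1'  one_more two       # no shared non-empty string
matches '(.+)\n\1'  one_more eno       # "e" sits on both sides of the newline
matches '^(.+)\n\1' one_more eno       # anchored: group must be the whole first line
matches '^(.+)\n\1' one      one_more  # first line is a prefix of the second
```

The helper prints yes or no for each case, which makes it easy to see where the unanchored and anchored forms disagree.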















