Why does “vectorizing” this simple R loop give the wrong result?

      Perhaps a very dumb question.



      I am trying to "vectorize" the following loop:



      set.seed(0)
      x <- round(runif(10), 2)
      # [1] 0.90 0.27 0.37 0.57 0.91 0.20 0.90 0.94 0.66 0.63
      sig <- sample.int(10)
      # [1] 1 2 9 5 3 4 8 6 7 10
      for (i in seq_along(sig)) x[i] <- x[sig[i]]
      x
      # [1] 0.90 0.27 0.66 0.91 0.66 0.91 0.94 0.91 0.94 0.63


      I think it is simply x[sig] but the result does not match.



      set.seed(0)
      x <- round(runif(10), 2)
      x[sig]
      # [1] 0.90 0.27 0.66 0.91 0.37 0.57 0.94 0.20 0.90 0.63


      What's wrong?







      r loops for-loop vectorization






      edited 53 mins ago

























      asked 1 hour ago









      李哲源

          2 Answers
          There is actually a trap here: address aliasing. The loop reads x and writes to x, and the memory block it reads overlaps the memory block it writes to. This self-reference introduces a loop-carried dependency: a later iteration may read an element that an earlier iteration has already overwritten. That is a hazard for "vectorization".



          By contrast, x[sig] creates a new memory block for writing, eliminating address aliasing.
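The difference can be traced on the question's own data. The following is a minimal sketch, assuming the values of x and sig printed in the question (typed in literally so the result does not depend on the random number generator); the names x0, x_loop, x_sub, stale and differs are introduced here for illustration only.

```r
# Values copied from the question; x0 keeps the untouched original.
x0  <- c(0.90, 0.27, 0.37, 0.57, 0.91, 0.20, 0.90, 0.94, 0.66, 0.63)
sig <- c(1, 2, 9, 5, 3, 4, 8, 6, 7, 10)

# In-place loop: reads from and writes to the same vector.
x_loop <- x0
for (i in seq_along(sig)) x_loop[i] <- x_loop[sig[i]]

# Subsetting: reads the untouched original and writes to a new vector.
x_sub <- x0[sig]

# Iterations that read a position the loop has already written to.
stale <- which(sig < seq_along(sig))

# Positions where the two results actually differ.
differs <- which(x_loop != x_sub)

stale    # 5 6 8 9
differs  # 5 6 8 9
```

For this data the two index sets coincide: every iteration with sig[i] < i reads a value that an earlier iteration has replaced. In general, a stale read changes the result only if the overwritten value differs from the original one.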



          Which piece of code is correct depends on what we want to do.



          If we want to shuffle / permute x, then x[sig] is the right one. The loop hopes to do an "in-place" permutation without using extra memory, but an "in-place" permutation is in fact a more complicated operation: not only do entries of x need to be moved around, entries of sig also need to be updated as the iteration proceeds.
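To illustrate the extra bookkeeping, here is a minimal sketch of one common approach, cycle following, which is not necessarily the scheme the answer has in mind. The function permute_in_place is a hypothetical helper, and it assumes sig is a permutation of seq_along(x). Because R passes arguments by value, this only mimics the logic of an in-place algorithm; saving memory for real would need C or Rcpp.

```r
# Hypothetical helper: rearranges x so that the result equals x[sig],
# moving one element at a time and updating sig along the way.
permute_in_place <- function(x, sig) {
  for (start in seq_along(sig)) {
    if (sig[start] == start) next   # fixed point, or already handled
    saved <- x[start]               # the only value overwritten before it is read
    cur <- start
    while (sig[cur] != start) {
      nxt <- sig[cur]
      x[cur] <- x[nxt]              # position nxt has not been overwritten yet
      sig[cur] <- cur               # mark position cur as finished
      cur <- nxt
    }
    x[cur] <- saved                 # close the cycle
    sig[cur] <- cur
  }
  x
}

x0  <- c(0.90, 0.27, 0.37, 0.57, 0.91, 0.20, 0.90, 0.94, 0.66, 0.63)
sig <- c(1, 2, 9, 5, 3, 4, 8, 6, 7, 10)
identical(permute_in_place(x0, sig), x0[sig])  # TRUE
```

Each cycle of the permutation needs only one saved value, and marking finished positions in sig is what keeps later iterations from moving the same elements again.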



          If we take the loop as the intended behavior, then there is no way to "vectorize" it. Well, if implementing the loop in Rcpp counts as "vectorization", then so be it. But there is no chance of further "vectorizing" the C / C++ loop with SIMD.




          Remark:



          This Q & A is motivated by another Q & A. OP originally presented a loop



          for (i in 1:num)
            for (j in 1:num)
              mat[i, j] <- mat[i, mat[j, "rm"]]




          It is tempting to "vectorize" it as



          mat[1:num, 1:num] <- mat[1:num, mat[1:num, "rm"]]


          but it is actually wrong. Later OP changed the loop to



          for (i in 1:num)
            for (j in 1:num)
              mat[i, j] <- mat[i, 1 + num + mat[j, "rm"]]




          which eliminates the address aliasing issue, because the columns to be replaced are the first num columns, while the columns to be looked up are after the first num columns.
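The non-overlapping case can be checked on a toy matrix. This is only a sketch under an assumed layout, since the original matrix is not shown: the first num columns are the targets, column num + 1 is "rm", and the last num columns hold the values to look up. The column names t1..t3 and s1..s3 and all values are made up for illustration.

```r
num <- 3

# Assumed layout: targets | "rm" | lookup columns.
mat <- matrix(0, nrow = num, ncol = 2 * num + 1)
colnames(mat) <- c(paste0("t", 1:num), "rm", paste0("s", 1:num))
mat[, "rm"] <- c(2, 3, 1)                           # indices into the lookup columns
mat[, num + 1 + 1:num] <- matrix(1:9, nrow = num)   # lookup values

# Loop version: reads only columns after the first num, writes only the first num.
loop_mat <- mat
for (i in 1:num)
  for (j in 1:num)
    loop_mat[i, j] <- loop_mat[i, 1 + num + loop_mat[j, "rm"]]

# Vectorized version: the same reads, done before any write.
vec_mat <- mat
vec_mat[1:num, 1:num] <- vec_mat[1:num, 1 + num + vec_mat[1:num, "rm"]]

identical(loop_mat, vec_mat)  # TRUE
```

Because no iteration reads a column that any iteration writes, the order of the assignments no longer matters and the block assignment gives the same matrix as the loop.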







          answered 1 hour ago









          李哲源

          • I agree somewhat with the first paragraph. It is the assignment to x from x sequentially versus "en bloc" that causes the discrepancy, but there is never over-writing of a "memory block". R does not make assignments "in place". Rather it makes a temporary copy of the original and renames it. And I would also not say it is a danger of "vectorization" since you were not really using what is called vectorization when using a for-loop. I would have considered the vectorized result correct and the for-loop method as incorrect.
            – 42-
            26 mins ago











          • @42- I think as long as the data to be modified are of the same mode as the original data, replacement or update of vector / matrix elements is indeed "in-place". You can try adding a tracemem(x) before the loop, and you will see no memory allocation message during the loop.
            – 李哲源
            22 mins ago






          • I will be very surprised if this turns out to be the case. I'll try to track down more authoritative documentation.
            – 42-
            22 mins ago







          • @42- Thank you. If you ever find anything, feel free to post it as an answer. I agree that my way of thinking about a loop is "C"-fashioned.
            – 李哲源
            16 mins ago
















          There is a simpler explanation. With your loop, you are overwriting one element of x at every step, replacing its former value with one of the other elements of x. So you get what you asked for. Essentially, it is a complicated form of sampling with replacement (sample(x, replace=TRUE)); whether you need such a complication depends on what you want to achieve.



          With your vectorized code, you are just asking for a certain permutation of x (without replacement), and that is what you get. The vectorized code is not doing the same thing as your loop. If you want to achieve the same result with a loop, you would first need to make a copy of x:



          set.seed(0)
          x <- x2 <- round(runif(10), 2)
          # [1] 0.90 0.27 0.37 0.57 0.91 0.20 0.90 0.94 0.66 0.63
          sig <- sample.int(10)
          # [1] 1 2 9 5 3 4 8 6 7 10
          for (i in seq_along(sig)) x2[i] <- x[sig[i]]
          identical(x2, x[sig])
          # [1] TRUE


          No danger of aliasing here: x and x2 refer initially to the same memory location, but this will change as soon as you change the first element of x2.
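The copy-on-modify behaviour described above can be seen in a few lines. This is a small sketch with made-up values; it shows only the visible effect (x keeps its values), not the memory addresses themselves.

```r
x  <- c(0.90, 0.27, 0.37)
x2 <- x        # no copy yet: both names refer to the same vector
x2[1] <- 0     # modifying x2 makes it a separate vector

x              # 0.90 0.27 0.37, unchanged
x2             # 0.00 0.27 0.37
```

Since the loop in this answer reads only from x and writes only to x2, every read sees an original value.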






          answered 5 mins ago









          lebatsnok

          • Yes, I know that the loop and x[sig] are different. Maybe I did not make this clear in my elaboration... But your interpretation of the former as sampling with replacement and the latter as sampling without replacement is interesting. While it may not be precise (given sig, the results of the loop and of x[sig] are both deterministic), it is indeed a different view of the issue.
            – 李哲源
            21 secs ago















