Why does “vectorizing” this simple R loop give the wrong result?

Perhaps a very dumb question.



I am trying to "vectorize" the following loop:



set.seed(0)
x <- round(runif(10), 2)
# [1] 0.90 0.27 0.37 0.57 0.91 0.20 0.90 0.94 0.66 0.63
sig <- sample.int(10)
# [1] 1 2 9 5 3 4 8 6 7 10
for (i in seq_along(sig)) x[i] <- x[sig[i]]
x
# [1] 0.90 0.27 0.66 0.91 0.66 0.91 0.94 0.91 0.94 0.63


I think it is simply x[sig] but the result does not match.



set.seed(0)
x <- round(runif(10), 2)
x[sig]
# [1] 0.90 0.27 0.66 0.91 0.37 0.57 0.94 0.20 0.90 0.63


What's wrong?

































  • I know that the for loop and x[sig] do different things. The purpose of this Q & A is not to judge which is correct, but to show why the latter is not a "vectorized" implementation of the former. The meaning of x[sig] is clear: it is a permutation, so many people tend to believe that the loop is simply wrong. But never be so sure: it can be a well-defined dynamic process.
    – 李哲源
    1 hour ago











  • The lessons are twofold: 1) given a for loop, watch out for potential "vectorization" hazards before trying to "vectorize" it; 2) given a meaningful operation, think twice about whether a loop implementation is correct.
    – 李哲源
    1 hour ago






















r loops for-loop vectorization














edited 12 mins ago by Boann
asked 2 hours ago by 李哲源
























2 Answers













There is actually a trap here: address aliasing. The loop reads x and writes to x, and the memory block it reads overlaps the memory block it writes to. Such self-reference introduces a loop-carried dependency and is a hazard for "vectorization".



By contrast, x[sig] creates a new memory block for writing, eliminating address aliasing.
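To make the dependency concrete, here is a minimal sketch. The values of x0 and sig are hardcoded from the printed output in the question rather than regenerated with set.seed(0), since default RNG behaviour may differ between R versions; the names x0, x_loop, x_perm, hazard and differs are introduced only for illustration.

```r
# Values copied from the output shown in the question
x0  <- c(0.90, 0.27, 0.37, 0.57, 0.91, 0.20, 0.90, 0.94, 0.66, 0.63)
sig <- c(1L, 2L, 9L, 5L, 3L, 4L, 8L, 6L, 7L, 10L)

# The loop: reads and writes the same vector
x_loop <- x0
for (i in seq_along(sig)) x_loop[i] <- x_loop[sig[i]]

# The "vectorized" candidate: reads x0, writes a fresh vector
x_perm <- x0[sig]

# Iterations that read a slot already overwritten by an earlier iteration
hazard <- which(sig < seq_along(sig))

# Positions where the two results disagree
differs <- which(x_loop != x_perm)

print(hazard)   # 5 6 8 9
print(differs)  # 5 6 8 9
```

For this data the two results disagree at exactly the iterations that read an already overwritten slot. In general the disagreeing positions can only be a subset of those iterations, because an overwritten slot may happen to hold the same value as before.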



Which piece of code is correct depends on what we want to do.



If we want to perform a shuffling / permutation of x, then x[sig] is the right one. The loop hopes to do an "in-place" permutation without using extra memory, but an "in-place" permutation is in fact a more complicated operation: not only do entries of x need to be swapped, entries of sig must also be swapped as the iteration proceeds.
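As a rough sketch of why the "in-place" version is more involved, the function below follows each cycle of the permutation and marks visited entries of sig as fixed points, which is one common variant rather than the literal swapping described above. The name permute_in_place is hypothetical, sig is assumed to be a permutation of seq_along(x), and in R the function still works on its own copies of the arguments, so "in place" is only conceptual here.

```r
permute_in_place <- function(x, sig) {
  # Assumption: sig is a permutation of 1..length(x)
  stopifnot(length(sig) == length(x), all(sort(sig) == seq_along(x)))
  for (i in seq_along(sig)) {
    j   <- i
    tmp <- x[i]                 # the only extra storage for values
    while (sig[j] != i) {
      k      <- sig[j]
      x[j]   <- x[k]            # x[k] has not been overwritten yet in this cycle
      sig[j] <- j               # mark position j as finished
      j      <- k
    }
    x[j]   <- tmp               # close the cycle with the saved value
    sig[j] <- j
  }
  x
}

x0  <- c(0.90, 0.27, 0.37, 0.57, 0.91, 0.20, 0.90, 0.94, 0.66, 0.63)
sig <- c(1L, 2L, 9L, 5L, 3L, 4L, 8L, 6L, 7L, 10L)
res <- permute_in_place(x0, sig)
print(identical(res, x0[sig]))  # TRUE
```

Note that sig is modified along the way, which is exactly the bookkeeping the naive loop omits.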



If we deem the loop to be the correct thing, then there is no way to "vectorize" it. If implementing the loop in Rcpp counts as "vectorization", then so be it, but there is no chance to further "vectorize" the C / C++ loop with SIMD.




Remark:



This Q & A was motivated by another Q & A, in which the OP originally presented this loop:



for (i in 1:num)
  for (j in 1:num)
    mat[i, j] <- mat[i, mat[j, "rm"]]




It is tempting to "vectorize" it as



mat[1:num, 1:num] <- mat[1:num, mat[1:num, "rm"]]


but it is actually wrong. Later OP changed the loop to



for (i in 1:num)
  for (j in 1:num)
    mat[i, j] <- mat[i, 1 + num + mat[j, "rm"]]




which eliminates the address aliasing issue, because the columns to be replaced are the first num columns, while the columns to be looked up are after the first num columns.
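The difference between the two matrix loops can be checked on a toy example. The layout below (num value columns, then a column named "rm", then num lookup columns) is an assumption made for illustration, as are the helper names loop_fill and vec_fill and the offset argument; the original matrix from the other Q & A is not reproduced here.

```r
num <- 3
mat <- cbind(matrix(1:9, num, num),      # columns to be replaced
             c(2, 3, 1),                 # lookup indices
             matrix(101:109, num, num))  # separate columns to read from
colnames(mat) <- c(paste0("a", 1:num), "rm", paste0("b", 1:num))

# Element-by-element loop; offset = 0 reads from the columns being written
loop_fill <- function(mat, num, offset) {
  for (i in 1:num)
    for (j in 1:num)
      mat[i, j] <- mat[i, offset + mat[j, "rm"]]
  mat
}

# Block assignment; the right-hand side is evaluated before any write
vec_fill <- function(mat, num, offset) {
  mat[1:num, 1:num] <- mat[1:num, offset + mat[1:num, "rm"]]
  mat
}

# Aliased version: read and write columns overlap, results differ
print(identical(loop_fill(mat, num, 0), vec_fill(mat, num, 0)))              # FALSE

# Non-aliased version: reads come from columns after "rm", results agree
print(identical(loop_fill(mat, num, 1 + num), vec_fill(mat, num, 1 + num)))  # TRUE
```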


























  • I agree somewhat with the first paragraph. It is the assignment to x from x sequentially versus "en bloc" that causes the discrepancy, but there is never over-writing of a "memory block". R does not make assignments "in place". Rather it makes a temporary copy of the original and renames it. And I would also not say it is a danger of "vectorization", since you were not really using what is called vectorization when using a for-loop. I would have considered the vectorized result correct and the for-loop method incorrect.
    – 42-
    2 hours ago











  • @42- I think as long as the data to be modified are of the same mode as the original data, replacement or update of vector / matrix elements is indeed "in-place". You can try adding a tracemem(x) before the loop, and you will see no memory allocation message during the loop.
    – 李哲源
    2 hours ago






  • I will be very surprised if this turns out to be the case. I'll try to track down more authoritative documentation.
    – 42-
    2 hours ago







  • @42- Thank you. If you ever find anything, feel free to post it as an answer. I agree that my way of thinking about a loop is "C"-fashioned.
    – 李哲源
    2 hours ago






























There is a simpler explanation. With your loop, you are overwriting one element of x at every step, replacing its former value by one of the other elements of x. So you get what you asked for. Essentially, it is a complicated form of sampling with replacement (sample(x, replace=TRUE)) -- whether you need such a complication depends on what you want to achieve.
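One way to see the "with replacement" flavour is to check which values survive. This is only a sketch: x0 and sig are hardcoded from the output printed in the question, and the names x_loop and lost are introduced for illustration.

```r
x0  <- c(0.90, 0.27, 0.37, 0.57, 0.91, 0.20, 0.90, 0.94, 0.66, 0.63)
sig <- c(1L, 2L, 9L, 5L, 3L, 4L, 8L, 6L, 7L, 10L)

x_loop <- x0
for (i in seq_along(sig)) x_loop[i] <- x_loop[sig[i]]

# Values of the original vector that no longer appear after the loop
lost <- setdiff(x0, x_loop)
print(lost)                                # 0.37 0.57 0.20

# The loop result repeats some values more often than the original did
print(table(x_loop))

# x0[sig], by contrast, keeps every value exactly as often as before
print(identical(sort(x0[sig]), sort(x0)))  # TRUE
```

Given sig, the loop is of course deterministic; the comparison with sampling only describes the shape of the result, where some values are duplicated and others vanish.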



With your vectorized code, you are just asking for a certain permutation of x (without replacement), and that is what you get. The vectorized code is not doing the same thing as your loop. If you want to achieve the same result with a loop, you would first need to make a copy of x:



set.seed(0)
x <- x2 <- round(runif(10), 2)
# [1] 0.90 0.27 0.37 0.57 0.91 0.20 0.90 0.94 0.66 0.63
sig <- sample.int(10)
# [1] 1 2 9 5 3 4 8 6 7 10
for (i in seq_along(sig)) x2[i] <- x[sig[i]]
identical(x2, x[sig])
#TRUE


No danger of aliasing here: x and x2 refer initially to the same memory location, but this will change as soon as you change the first element of x2.
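The copy semantics can be checked directly. In this small sketch the vector values are taken from the question's printed output, and the sentinel value -1 is arbitrary.

```r
x   <- c(0.90, 0.27, 0.37, 0.57, 0.91, 0.20, 0.90, 0.94, 0.66, 0.63)
sig <- c(1L, 2L, 9L, 5L, 3L, 4L, 8L, 6L, 7L, 10L)

x2 <- x          # x2 starts out with the same values as x
x2[1] <- -1      # modifying x2 leaves x untouched
print(x[1])      # 0.9

# Reading from the untouched x while writing to x2 reproduces x[sig]
x2 <- x
for (i in seq_along(sig)) x2[i] <- x[sig[i]]
print(identical(x2, x[sig]))  # TRUE
```

Because every read goes to x and every write goes to x2, no iteration can observe the effect of an earlier one.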


























  • Yes, I know that the loop and x[sig] are different. Maybe I did not make this clear in my elaboration... But your interpretation of the former as sampling with replacement and the latter as sampling without replacement is interesting. While it may not be precise (given sig, the results of the loop and of x[sig] are both deterministic), it is indeed a different view of the issue.
    – 李哲源
    1 hour ago










answered 2 hours ago by 李哲源











answered 1 hour ago by lebatsnok










