Populating an array in do loop, allocating memory or not

I am doing a calculation with a Do loop that can run for as many as 1 million iterations. In each iteration, I want to store a value in an array to plot afterward. In MATLAB I would normally preallocate the memory for the array, e.g. a=zeros(1000000,1);. I have been told that in Mathematica I don't need to do that: I can simply write a={} and then set a[[i]]=number in every iteration. Is this true? Or should I be doing something like a=ConstantArray[0,1000000];?



Thank you







asked Aug 22 at 9:30 by Erdem




















2 Answers






The approach a={} and then setting a[[i]]=number is not possible, because an empty list has no element i to assign to. One could use Append, but it should be avoided for the same reasons such methods should be avoided in MATLAB: growing an array one element at a time copies it over and over.

If you cannot use Table to construct the array, preallocation with a = ConstantArray[0,1000000]; is one way to go. If your array is supposed to contain machine floats, then use a = ConstantArray[0.,1000000]; and make sure that all values written into the array are also machine floats (otherwise, the array will be unpacked). For complex machine floats, use ConstantArray[0. + 0. I,1000000].
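The packing remark can be checked directly. The following is only a minimal sketch: it assumes the Developer`PackedArrayQ function from the Developer` context, and the array length 10 and the stored values are arbitrary illustrations that do not come from the original answer.

```mathematica
(* Preallocate with machine reals; ConstantArray returns a packed array here. *)
a = ConstantArray[0., 10];
Developer`PackedArrayQ[a]

(* Writing another machine real keeps the array packed. *)
a[[1]] = 2.5;
Developer`PackedArrayQ[a]

(* Writing an exact integer does not fit the packed real type,
   so the array is unpacked. *)
a[[2]] = 1;
Developer`PackedArrayQ[a]
```

The three checks should give True, True, and False in that order. Wrapping a value in N before storing it (for example a[[2]] = N[1]) is one way to keep everything a machine float.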



In general, this approach will be rather slow, as Mathematica is not very good at (uncompiled) loops of any kind. It is often advantageous for performance to put the code into Compile, provided that the code for generating the values is compilable.

Examples

Here are three possible approaches for the additive assembly of a dense vector: one with Do (producing a); one with SparseArray (producing b; notice that one has to go through some pain to make it assemble additively); and one using a compiled helper function (producing c).



    n = 1000000;
    pos = RandomInteger[{1, n}, 10 n];
    vals = RandomReal[{-1, 1}, 10 n];

    (
     a = ConstantArray[0., n];
     Do[a[[pos[[i]]]] += vals[[i]], {i, 1, Length[pos]}];
     ) // AbsoluteTiming // First

    b = With[{spopt = SystemOptions["SparseArrayOptions"]},
        Internal`WithLocalSettings[
         SetSystemOptions[
          "SparseArrayOptions" -> {"TreatRepeatedEntries" -> Total}],

         SparseArray[Partition[pos, 1] -> vals, n],

         SetSystemOptions[spopt]]
        ]; // AbsoluteTiming // First


    (* Compiled helper function; can be reused. Make sure that n is greater than or equal to Max[pos]. *)

    cAssembleVector = Compile[{{pos, _Integer, 1}, {vals, _Real, 1}, {n, _Integer}},
       Block[{a},
        (* We use Table because ConstantArray is not compilable. *)
        a = Table[0., {n}];
        Do[
         (* Compile`GetElement is a read-only substitute for Part that skips bounds checks and is thus faster than Part. Use with caution. *)
         a[[Compile`GetElement[pos, i]]] += Compile`GetElement[vals, i],
         {i, 1, Length[pos]}
         ];
        a
        ],
       CompilationTarget -> "C",
       RuntimeOptions -> "Speed"
       ];

    c = cAssembleVector[pos, vals, n]; // AbsoluteTiming // First

    a == b == c



    23.4489

    1.59865

    0.073619

    True




As you can see, Do performs really badly. The compiled approach is best: it is about 300 times faster than Do (0.073619 s versus 23.4489 s). It might seem a bit tedious, but using Compile is still much easier than using MEX functions in MATLAB.

Notice that this vector assembly features random read and write access, which makes it a rather hard problem. There might be faster methods for generating more structured arrays.

Final remark

If you don't know the length of the array in advance, Internal`Bag might also serve as an efficient substitute for Sow and Reap, in particular in compiled code. It is undocumented, though.
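As a rough illustration of the idea, here is a small uncompiled sketch. Because Internal`Bag is undocumented, the calls used here (Internal`Bag, Internal`StuffBag, Internal`BagPart) follow common community usage rather than any official reference and may change between versions. The loop bound 50 and the selection test are arbitrary assumptions made for the example.

```mathematica
(* Create an empty, growable container. *)
bag = Internal`Bag[];

(* Collect an unknown number of values without copying a list on each step. *)
Do[
  If[Mod[i^2, 7] == 1, Internal`StuffBag[bag, i]],
  {i, 1, 50}
];

(* Extract everything collected so far as an ordinary list. *)
result = Internal`BagPart[bag, All]
```

The pattern mirrors Sow and Reap: stuffing plays the role of Sow, and extracting the parts at the end plays the role of Reap.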







answered Aug 22 at 9:40 by Henrik Schumacher (edited Aug 22 at 10:43; accepted answer)




















If you know the number of elements in advance, use Table.

If you don't, use Sow and Reap.

Given that you talk about preallocation and Do loops, it seems to me that you do know the number of elements, so go with Table, or possibly with Array.
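To make the two cases concrete, here is a minimal sketch. The function f, the length n = 10, and the EvenQ test are hypothetical stand-ins for whatever is computed in each iteration; they are not taken from the question.

```mathematica
n = 10;
f[i_] := i^2;   (* hypothetical per-iteration computation *)

(* Known length: build the whole list in one go, no preallocation needed. *)
known = Table[f[i], {i, 1, n}];
alsoKnown = Array[f, n];

(* Unknown length: Sow each value that should be kept, and Reap them at the end. *)
collected = Reap[
    Do[If[EvenQ[i], Sow[f[i]]], {i, 1, n}]
   ][[2, 1]];
```

Here known and alsoKnown hold the same ten squares, while collected holds only the values that were sown. Note that the [[2, 1]] extraction assumes at least one value was sown; if nothing is sown, the second part of the Reap result is an empty list.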







answered Aug 22 at 9:52 by Szabolcs



























                               
