List generation speed change at ~800 iterations

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP











up vote
6
down vote

favorite
1












I recently have been developing a project for my students to explore some functionality of the Wolfram Language and have requested that they investigate the timing for several different methods of list generation. Here are the techniques i've asked them to explore:



p1 = 
ListPlot[
Table[First[Table[Prime[n], n, 1, k] // AbsoluteTiming], k, 1, 200000, 1000],
PlotStyle -> Black];
p2 =
ListPlot[
Table[First[Prime[Range[k]] // AbsoluteTiming], k, 1, 200000, 1000],
PlotStyle -> Red];
p3 =
ListPlot[
Table[First[Reap[Do[Sow[Prime[n]], n, k]] // AbsoluteTiming], k, 1, 200000, 1000],
PlotStyle -> Green];
p4 =
ListPlot[
Table[
First[Reap[For[n = 1, n <= k, n++, Sow[Prime[n]]]] // AbsoluteTiming],
k, 1, 200000, 1000],
PlotStyle -> Blue];
p5 =
ListPlot[
Table[First[Array[Prime[#] &, k] // AbsoluteTiming], k, 1, 200000,1000],
PlotStyle -> Orange];

Show[p1, p2, p3, p4, p5]


The result is the following graph:



Plot of timing as a function of iterations



Is there something special happening around $n=80000$? Looks like the rate of change of cost for creating a list is abruptly changing near this value.










share|improve this question



















  • 3




    Using Prime as test is somewhat atypical. It would provide much more insight using a function that is also vectorized such as f = Sin[N[#]] &;. I guess that Prime shifts from a very cheap algorithm for small primes (maybe a lookup table) to a more general, but also more expensive algorithm. The fact that the performance curves look continuous tells me that this switching is either well-balanced or that simply an additional effort has to be made for primes after the 80k-th one. Btw., Array[Prime, k] will speed up p5 a little.
    – Henrik Schumacher
    7 hours ago






  • 1




    @HenrikSchumacher I replaced Prime with Sin and no change in slope is visible.
    – mikado
    7 hours ago










  • @mikado Right. That was my point. Prime does not show how Mathematica works in general. I am not sure whether any real computations are done at all with Prime in this range of numbers; maybe only two different file formats for the storage of primes are used: one with low compression for the first 80000 primes and one with higher compression (and thus slower access) for the next 120000 primes.
    – Henrik Schumacher
    7 hours ago










  • Can this be the point where compilation no longer works because the numbers involved become too large to be represent by machine integers?
    – Sjoerd Smit
    7 hours ago










  • @SjoerdSmit I don't think so: Mathematica's integers are 64 bit long. N[Developer`$MaxMachineInteger/Prime[80000]] returns almost 10^13. So the machine integer border is far, far away.
    – Henrik Schumacher
    6 hours ago















up vote
6
down vote

favorite
1












I recently have been developing a project for my students to explore some functionality of the Wolfram Language and have requested that they investigate the timing for several different methods of list generation. Here are the techniques i've asked them to explore:



p1 = 
ListPlot[
Table[First[Table[Prime[n], n, 1, k] // AbsoluteTiming], k, 1, 200000, 1000],
PlotStyle -> Black];
p2 =
ListPlot[
Table[First[Prime[Range[k]] // AbsoluteTiming], k, 1, 200000, 1000],
PlotStyle -> Red];
p3 =
ListPlot[
Table[First[Reap[Do[Sow[Prime[n]], n, k]] // AbsoluteTiming], k, 1, 200000, 1000],
PlotStyle -> Green];
p4 =
ListPlot[
Table[
First[Reap[For[n = 1, n <= k, n++, Sow[Prime[n]]]] // AbsoluteTiming],
k, 1, 200000, 1000],
PlotStyle -> Blue];
p5 =
ListPlot[
Table[First[Array[Prime[#] &, k] // AbsoluteTiming], k, 1, 200000,1000],
PlotStyle -> Orange];

Show[p1, p2, p3, p4, p5]


The result is the following graph:



Plot of timing as a function of iterations



Is there something special happening around $n=80000$? Looks like the rate of change of cost for creating a list is abruptly changing near this value.










share|improve this question



















  • 3




    Using Prime as test is somewhat atypical. It would provide much more insight using a function that is also vectorized such as f = Sin[N[#]] &;. I guess that Prime shifts from a very cheap algorithm for small primes (maybe a lookup table) to a more general, but also more expensive algorithm. The fact that the performance curves look continuous tells me that this switching is either well-balanced or that simply an additional effort has to be made for primes after the 80k-th one. Btw., Array[Prime, k] will speed up p5 a little.
    – Henrik Schumacher
    7 hours ago






  • 1




    @HenrikSchumacher I replaced Prime with Sin and no change in slope is visible.
    – mikado
    7 hours ago










  • @mikado Right. That was my point. Prime does not show how Mathematica works in general. I am not sure whether any real computations are done at all with Prime in this range of numbers; maybe only two different file formats for the storage of primes are used: one with low compression for the first 80000 primes and one with higher compression (and thus slower access) for the next 120000 primes.
    – Henrik Schumacher
    7 hours ago










  • Can this be the point where compilation no longer works because the numbers involved become too large to be represent by machine integers?
    – Sjoerd Smit
    7 hours ago










  • @SjoerdSmit I don't think so: Mathematica's integers are 64 bit long. N[Developer`$MaxMachineInteger/Prime[80000]] returns almost 10^13. So the machine integer border is far, far away.
    – Henrik Schumacher
    6 hours ago













up vote
6
down vote

favorite
1









up vote
6
down vote

favorite
1






1





I recently have been developing a project for my students to explore some functionality of the Wolfram Language and have requested that they investigate the timing for several different methods of list generation. Here are the techniques i've asked them to explore:



p1 = 
ListPlot[
Table[First[Table[Prime[n], n, 1, k] // AbsoluteTiming], k, 1, 200000, 1000],
PlotStyle -> Black];
p2 =
ListPlot[
Table[First[Prime[Range[k]] // AbsoluteTiming], k, 1, 200000, 1000],
PlotStyle -> Red];
p3 =
ListPlot[
Table[First[Reap[Do[Sow[Prime[n]], n, k]] // AbsoluteTiming], k, 1, 200000, 1000],
PlotStyle -> Green];
p4 =
ListPlot[
Table[
First[Reap[For[n = 1, n <= k, n++, Sow[Prime[n]]]] // AbsoluteTiming],
k, 1, 200000, 1000],
PlotStyle -> Blue];
p5 =
ListPlot[
Table[First[Array[Prime[#] &, k] // AbsoluteTiming], k, 1, 200000,1000],
PlotStyle -> Orange];

Show[p1, p2, p3, p4, p5]


The result is the following graph:



Plot of timing as a function of iterations



Is there something special happening around $n=80000$? Looks like the rate of change of cost for creating a list is abruptly changing near this value.










share|improve this question















I recently have been developing a project for my students to explore some functionality of the Wolfram Language and have requested that they investigate the timing for several different methods of list generation. Here are the techniques i've asked them to explore:



p1 = 
ListPlot[
Table[First[Table[Prime[n], n, 1, k] // AbsoluteTiming], k, 1, 200000, 1000],
PlotStyle -> Black];
p2 =
ListPlot[
Table[First[Prime[Range[k]] // AbsoluteTiming], k, 1, 200000, 1000],
PlotStyle -> Red];
p3 =
ListPlot[
Table[First[Reap[Do[Sow[Prime[n]], n, k]] // AbsoluteTiming], k, 1, 200000, 1000],
PlotStyle -> Green];
p4 =
ListPlot[
Table[
First[Reap[For[n = 1, n <= k, n++, Sow[Prime[n]]]] // AbsoluteTiming],
k, 1, 200000, 1000],
PlotStyle -> Blue];
p5 =
ListPlot[
Table[First[Array[Prime[#] &, k] // AbsoluteTiming], k, 1, 200000,1000],
PlotStyle -> Orange];

Show[p1, p2, p3, p4, p5]


The result is the following graph:



Plot of timing as a function of iterations



Is there something special happening around $n=80000$? Looks like the rate of change of cost for creating a list is abruptly changing near this value.







list-manipulation timing






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited 5 hours ago









m_goldberg

82.2k869190




82.2k869190










asked 7 hours ago









JEM

462313




462313







  • 3




    Using Prime as test is somewhat atypical. It would provide much more insight using a function that is also vectorized such as f = Sin[N[#]] &;. I guess that Prime shifts from a very cheap algorithm for small primes (maybe a lookup table) to a more general, but also more expensive algorithm. The fact that the performance curves look continuous tells me that this switching is either well-balanced or that simply an additional effort has to be made for primes after the 80k-th one. Btw., Array[Prime, k] will speed up p5 a little.
    – Henrik Schumacher
    7 hours ago






  • 1




    @HenrikSchumacher I replaced Prime with Sin and no change in slope is visible.
    – mikado
    7 hours ago










  • @mikado Right. That was my point. Prime does not show how Mathematica works in general. I am not sure whether any real computations are done at all with Prime in this range of numbers; maybe only two different file formats for the storage of primes are used: one with low compression for the first 80000 primes and one with higher compression (and thus slower access) for the next 120000 primes.
    – Henrik Schumacher
    7 hours ago










  • Can this be the point where compilation no longer works because the numbers involved become too large to be represent by machine integers?
    – Sjoerd Smit
    7 hours ago










  • @SjoerdSmit I don't think so: Mathematica's integers are 64 bit long. N[Developer`$MaxMachineInteger/Prime[80000]] returns almost 10^13. So the machine integer border is far, far away.
    – Henrik Schumacher
    6 hours ago













  • 3




    Using Prime as test is somewhat atypical. It would provide much more insight using a function that is also vectorized such as f = Sin[N[#]] &;. I guess that Prime shifts from a very cheap algorithm for small primes (maybe a lookup table) to a more general, but also more expensive algorithm. The fact that the performance curves look continuous tells me that this switching is either well-balanced or that simply an additional effort has to be made for primes after the 80k-th one. Btw., Array[Prime, k] will speed up p5 a little.
    – Henrik Schumacher
    7 hours ago






  • 1




    @HenrikSchumacher I replaced Prime with Sin and no change in slope is visible.
    – mikado
    7 hours ago










  • @mikado Right. That was my point. Prime does not show how Mathematica works in general. I am not sure whether any real computations are done at all with Prime in this range of numbers; maybe only two different file formats for the storage of primes are used: one with low compression for the first 80000 primes and one with higher compression (and thus slower access) for the next 120000 primes.
    – Henrik Schumacher
    7 hours ago










  • Can this be the point where compilation no longer works because the numbers involved become too large to be represent by machine integers?
    – Sjoerd Smit
    7 hours ago










  • @SjoerdSmit I don't think so: Mathematica's integers are 64 bit long. N[Developer`$MaxMachineInteger/Prime[80000]] returns almost 10^13. So the machine integer border is far, far away.
    – Henrik Schumacher
    6 hours ago








3




3




Using Prime as test is somewhat atypical. It would provide much more insight using a function that is also vectorized such as f = Sin[N[#]] &;. I guess that Prime shifts from a very cheap algorithm for small primes (maybe a lookup table) to a more general, but also more expensive algorithm. The fact that the performance curves look continuous tells me that this switching is either well-balanced or that simply an additional effort has to be made for primes after the 80k-th one. Btw., Array[Prime, k] will speed up p5 a little.
– Henrik Schumacher
7 hours ago




Using Prime as test is somewhat atypical. It would provide much more insight using a function that is also vectorized such as f = Sin[N[#]] &;. I guess that Prime shifts from a very cheap algorithm for small primes (maybe a lookup table) to a more general, but also more expensive algorithm. The fact that the performance curves look continuous tells me that this switching is either well-balanced or that simply an additional effort has to be made for primes after the 80k-th one. Btw., Array[Prime, k] will speed up p5 a little.
– Henrik Schumacher
7 hours ago




1




1




@HenrikSchumacher I replaced Prime with Sin and no change in slope is visible.
– mikado
7 hours ago




@HenrikSchumacher I replaced Prime with Sin and no change in slope is visible.
– mikado
7 hours ago












@mikado Right. That was my point. Prime does not show how Mathematica works in general. I am not sure whether any real computations are done at all with Prime in this range of numbers; maybe only two different file formats for the storage of primes are used: one with low compression for the first 80000 primes and one with higher compression (and thus slower access) for the next 120000 primes.
– Henrik Schumacher
7 hours ago




@mikado Right. That was my point. Prime does not show how Mathematica works in general. I am not sure whether any real computations are done at all with Prime in this range of numbers; maybe only two different file formats for the storage of primes are used: one with low compression for the first 80000 primes and one with higher compression (and thus slower access) for the next 120000 primes.
– Henrik Schumacher
7 hours ago












Can this be the point where compilation no longer works because the numbers involved become too large to be represent by machine integers?
– Sjoerd Smit
7 hours ago




Can this be the point where compilation no longer works because the numbers involved become too large to be represent by machine integers?
– Sjoerd Smit
7 hours ago












@SjoerdSmit I don't think so: Mathematica's integers are 64 bit long. N[Developer`$MaxMachineInteger/Prime[80000]] returns almost 10^13. So the machine integer border is far, far away.
– Henrik Schumacher
6 hours ago





@SjoerdSmit I don't think so: Mathematica's integers are 64 bit long. N[Developer`$MaxMachineInteger/Prime[80000]] returns almost 10^13. So the machine integer border is far, far away.
– Henrik Schumacher
6 hours ago











1 Answer
1






active

oldest

votes

















up vote
3
down vote













Regarding this statement




Looks like the rate of change of cost for creating a list is abruptly changing near this value.




I have a different opinion. We can use RepeatedTiming to get a somewhat consistent measurement and check the runtime for one single Prime call taking the list creation out of the game



data1 = Table[i, First[RepeatedTiming[Prime[i]]], i, 1, 100000, 
100];
ListLinePlot[data1]


Mathematica graphics



So we see the same jump, and I believe it is a switch from one algorithm to the next in calculating the primes. On closer inspection, you will find that the position of the jump appears to be precisely at n=78499. Why? I guess the reason is that we reached one million:



Prime[78498]
Prime[78499]
(* 999983 *)
(* 1000003 *)


In the documentation we find




Prime and PrimePi use sparse caching and sieving. For large n, the Lagarias–Miller–Odlyzko algorithm for PrimePi is used, based on asymptotic estimates of the density of primes, and is inverted to give Prime.




So let's assume for n<78499 Mathematica uses caching. Then getting a result, mostly depends on how fast the lookup is. Of course, you need to take into account the real smart users, who, when using Prime[n] definitely will request Prime[n+1] (or 2,3,...).



Let us make an experiment. Instead of using RepeatedTiming, we will simply access dn primes and measure the time. So we will touch each prime but we measure in chunks:



With[dn = 100,
dataDoForward =
Table[i, First[AbsoluteTiming[Do[Prime[j], j, i, i + dn]]], i,
1, 150000, dn]
];
ListLinePlot[dataDoForward, PlotRange -> All]


Mathematica graphics



No surprise here. The same jump. However, what happens if we access the primes in the inner Do in reverse order?



With[dn = 100,
dataDoReverse =
Table[i,
First[AbsoluteTiming[Do[Prime[j], j, i + dn, i, -1]]], i, 1,
150000, dn]
];
ListLinePlot[dataDoReverse, PlotRange -> All]


Mathematica graphics



It seems the Wolfram Developers are counting on that we are somewhat reasonable people that don't do these stunts. Please pay particular attention to the order of the difference in time. For 100 primes around n=75000 we need under 0.1 microseconds, while the reverse access is almost 1000x slower.



So my conclusion without official information is that primes under 1 million are heavily cached and accessing them loads some larger primes so that the next access is faster. For lager primes, an algorithm is used that really calculates the result. This seems stable in timing, no matter if you access chunks of primes forward or backwards



ListLinePlot[dataDoForward, dataDoReverse, 
PlotRange -> Automatic, 0, 0.0005,
PlotLegends -> "Forward Access", "Backward Access"
]


Mathematica graphics



I cannot explain, why the forward access method is a bit slower at the beginning for primes larger 1 million, but it is possible that some caches need to be cleared. This however is a topic for someone who actually knows what Wolfram is doing under the hood.






share|improve this answer




















    Your Answer




    StackExchange.ifUsing("editor", function ()
    return StackExchange.using("mathjaxEditing", function ()
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
    );
    );
    , "mathjax-editing");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "387"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    convertImagesToLinks: false,
    noModals: false,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );













     

    draft saved


    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathematica.stackexchange.com%2fquestions%2f182427%2flist-generation-speed-change-at-800-iterations%23new-answer', 'question_page');

    );

    Post as a guest






























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    3
    down vote













    Regarding this statement




    Looks like the rate of change of cost for creating a list is abruptly changing near this value.




    I have a different opinion. We can use RepeatedTiming to get a somewhat consistent measurement and check the runtime for one single Prime call taking the list creation out of the game



    data1 = Table[i, First[RepeatedTiming[Prime[i]]], i, 1, 100000, 
    100];
    ListLinePlot[data1]


    Mathematica graphics



    So we see the same jump, and I believe it is a switch from one algorithm to the next in calculating the primes. On closer inspection, you will find that the position of the jump appears to be precisely at n=78499. Why? I guess the reason is that we reached one million:



    Prime[78498]
    Prime[78499]
    (* 999983 *)
    (* 1000003 *)


    In the documentation we find




    Prime and PrimePi use sparse caching and sieving. For large n, the Lagarias–Miller–Odlyzko algorithm for PrimePi is used, based on asymptotic estimates of the density of primes, and is inverted to give Prime.




    So let's assume for n<78499 Mathematica uses caching. Then getting a result, mostly depends on how fast the lookup is. Of course, you need to take into account the real smart users, who, when using Prime[n] definitely will request Prime[n+1] (or 2,3,...).



    Let us make an experiment. Instead of using RepeatedTiming, we will simply access dn primes and measure the time. So we will touch each prime but we measure in chunks:



    With[dn = 100,
    dataDoForward =
    Table[i, First[AbsoluteTiming[Do[Prime[j], j, i, i + dn]]], i,
    1, 150000, dn]
    ];
    ListLinePlot[dataDoForward, PlotRange -> All]


    Mathematica graphics



    No surprise here. The same jump. However, what happens if we access the primes in the inner Do in reverse order?



    With[dn = 100,
    dataDoReverse =
    Table[i,
    First[AbsoluteTiming[Do[Prime[j], j, i + dn, i, -1]]], i, 1,
    150000, dn]
    ];
    ListLinePlot[dataDoReverse, PlotRange -> All]


    Mathematica graphics



    It seems the Wolfram Developers are counting on that we are somewhat reasonable people that don't do these stunts. Please pay particular attention to the order of the difference in time. For 100 primes around n=75000 we need under 0.1 microseconds, while the reverse access is almost 1000x slower.



    So my conclusion without official information is that primes under 1 million are heavily cached and accessing them loads some larger primes so that the next access is faster. For lager primes, an algorithm is used that really calculates the result. This seems stable in timing, no matter if you access chunks of primes forward or backwards



    ListLinePlot[dataDoForward, dataDoReverse, 
    PlotRange -> Automatic, 0, 0.0005,
    PlotLegends -> "Forward Access", "Backward Access"
    ]


    Mathematica graphics



    I cannot explain, why the forward access method is a bit slower at the beginning for primes larger 1 million, but it is possible that some caches need to be cleared. This however is a topic for someone who actually knows what Wolfram is doing under the hood.






    share|improve this answer
























      up vote
      3
      down vote













      Regarding this statement




      Looks like the rate of change of cost for creating a list is abruptly changing near this value.




      I have a different opinion. We can use RepeatedTiming to get a somewhat consistent measurement and check the runtime for one single Prime call taking the list creation out of the game



      data1 = Table[i, First[RepeatedTiming[Prime[i]]], i, 1, 100000, 
      100];
      ListLinePlot[data1]


      Mathematica graphics



      So we see the same jump, and I believe it is a switch from one algorithm to the next in calculating the primes. On closer inspection, you will find that the position of the jump appears to be precisely at n=78499. Why? I guess the reason is that we reached one million:



      Prime[78498]
      Prime[78499]
      (* 999983 *)
      (* 1000003 *)


      In the documentation we find




      Prime and PrimePi use sparse caching and sieving. For large n, the Lagarias–Miller–Odlyzko algorithm for PrimePi is used, based on asymptotic estimates of the density of primes, and is inverted to give Prime.




      So let's assume for n<78499 Mathematica uses caching. Then getting a result, mostly depends on how fast the lookup is. Of course, you need to take into account the real smart users, who, when using Prime[n] definitely will request Prime[n+1] (or 2,3,...).



      Let us make an experiment. Instead of using RepeatedTiming, we will simply access dn primes and measure the time. So we will touch each prime but we measure in chunks:



      With[dn = 100,
      dataDoForward =
      Table[i, First[AbsoluteTiming[Do[Prime[j], j, i, i + dn]]], i,
      1, 150000, dn]
      ];
      ListLinePlot[dataDoForward, PlotRange -> All]


      Mathematica graphics



      No surprise here. The same jump. However, what happens if we access the primes in the inner Do in reverse order?



      With[dn = 100,
      dataDoReverse =
      Table[i,
      First[AbsoluteTiming[Do[Prime[j], j, i + dn, i, -1]]], i, 1,
      150000, dn]
      ];
      ListLinePlot[dataDoReverse, PlotRange -> All]


      Mathematica graphics



      It seems the Wolfram Developers are counting on that we are somewhat reasonable people that don't do these stunts. Please pay particular attention to the order of the difference in time. For 100 primes around n=75000 we need under 0.1 microseconds, while the reverse access is almost 1000x slower.



      So my conclusion without official information is that primes under 1 million are heavily cached and accessing them loads some larger primes so that the next access is faster. For lager primes, an algorithm is used that really calculates the result. This seems stable in timing, no matter if you access chunks of primes forward or backwards



      ListLinePlot[dataDoForward, dataDoReverse, 
      PlotRange -> Automatic, 0, 0.0005,
      PlotLegends -> "Forward Access", "Backward Access"
      ]


      Mathematica graphics



      I cannot explain, why the forward access method is a bit slower at the beginning for primes larger 1 million, but it is possible that some caches need to be cleared. This however is a topic for someone who actually knows what Wolfram is doing under the hood.






      share|improve this answer






















        up vote
        3
        down vote










        up vote
        3
        down vote









        Regarding this statement




        Looks like the rate of change of cost for creating a list is abruptly changing near this value.




        I have a different opinion. We can use RepeatedTiming to get a somewhat consistent measurement and check the runtime for one single Prime call taking the list creation out of the game



        data1 = Table[i, First[RepeatedTiming[Prime[i]]], i, 1, 100000, 
        100];
        ListLinePlot[data1]


        Mathematica graphics



        So we see the same jump, and I believe it is a switch from one algorithm to the next in calculating the primes. On closer inspection, you will find that the position of the jump appears to be precisely at n=78499. Why? I guess the reason is that we reached one million:



        Prime[78498]
        Prime[78499]
        (* 999983 *)
        (* 1000003 *)


        In the documentation we find




        Prime and PrimePi use sparse caching and sieving. For large n, the Lagarias–Miller–Odlyzko algorithm for PrimePi is used, based on asymptotic estimates of the density of primes, and is inverted to give Prime.




        So let's assume for n<78499 Mathematica uses caching. Then getting a result, mostly depends on how fast the lookup is. Of course, you need to take into account the real smart users, who, when using Prime[n] definitely will request Prime[n+1] (or 2,3,...).



        Let us make an experiment. Instead of using RepeatedTiming, we will simply access dn primes and measure the time. So we will touch each prime but we measure in chunks:



        With[dn = 100,
        dataDoForward =
        Table[i, First[AbsoluteTiming[Do[Prime[j], j, i, i + dn]]], i,
        1, 150000, dn]
        ];
        ListLinePlot[dataDoForward, PlotRange -> All]


        Mathematica graphics



        No surprise here. The same jump. However, what happens if we access the primes in the inner Do in reverse order?



        With[dn = 100,
        dataDoReverse =
        Table[i,
        First[AbsoluteTiming[Do[Prime[j], j, i + dn, i, -1]]], i, 1,
        150000, dn]
        ];
        ListLinePlot[dataDoReverse, PlotRange -> All]


        Mathematica graphics



        It seems the Wolfram Developers are counting on that we are somewhat reasonable people that don't do these stunts. Please pay particular attention to the order of the difference in time. For 100 primes around n=75000 we need under 0.1 microseconds, while the reverse access is almost 1000x slower.



        So my conclusion without official information is that primes under 1 million are heavily cached and accessing them loads some larger primes so that the next access is faster. For lager primes, an algorithm is used that really calculates the result. This seems stable in timing, no matter if you access chunks of primes forward or backwards



        ListLinePlot[dataDoForward, dataDoReverse, 
        PlotRange -> Automatic, 0, 0.0005,
        PlotLegends -> "Forward Access", "Backward Access"
        ]


        Mathematica graphics



        I cannot explain, why the forward access method is a bit slower at the beginning for primes larger 1 million, but it is possible that some caches need to be cleared. This however is a topic for someone who actually knows what Wolfram is doing under the hood.






        share|improve this answer












        Regarding this statement




        Looks like the rate of change of cost for creating a list is abruptly changing near this value.




        I have a different opinion. We can use RepeatedTiming to get a somewhat consistent measurement and check the runtime for one single Prime call taking the list creation out of the game



        data1 = Table[i, First[RepeatedTiming[Prime[i]]], i, 1, 100000, 
        100];
        ListLinePlot[data1]


        Mathematica graphics



        So we see the same jump, and I believe it is a switch from one algorithm to the next in calculating the primes. On closer inspection, you will find that the position of the jump appears to be precisely at n=78499. Why? I guess the reason is that we reached one million:



        Prime[78498]
        Prime[78499]
        (* 999983 *)
        (* 1000003 *)


        In the documentation we find




        Prime and PrimePi use sparse caching and sieving. For large n, the Lagarias–Miller–Odlyzko algorithm for PrimePi is used, based on asymptotic estimates of the density of primes, and is inverted to give Prime.




        So let's assume for n<78499 Mathematica uses caching. Then getting a result, mostly depends on how fast the lookup is. Of course, you need to take into account the real smart users, who, when using Prime[n] definitely will request Prime[n+1] (or 2,3,...).



        Let us make an experiment. Instead of using RepeatedTiming, we will simply access dn primes and measure the time. So we will touch each prime but we measure in chunks:



        With[dn = 100,
        dataDoForward =
        Table[i, First[AbsoluteTiming[Do[Prime[j], j, i, i + dn]]], i,
        1, 150000, dn]
        ];
        ListLinePlot[dataDoForward, PlotRange -> All]


        Mathematica graphics



        No surprise here. The same jump. However, what happens if we access the primes in the inner Do in reverse order?



        With[dn = 100,
        dataDoReverse =
        Table[i,
        First[AbsoluteTiming[Do[Prime[j], j, i + dn, i, -1]]], i, 1,
        150000, dn]
        ];
        ListLinePlot[dataDoReverse, PlotRange -> All]


        Mathematica graphics



        It seems the Wolfram Developers are counting on that we are somewhat reasonable people that don't do these stunts. Please pay particular attention to the order of the difference in time. For 100 primes around n=75000 we need under 0.1 microseconds, while the reverse access is almost 1000x slower.



        So my conclusion without official information is that primes under 1 million are heavily cached and accessing them loads some larger primes so that the next access is faster. For lager primes, an algorithm is used that really calculates the result. This seems stable in timing, no matter if you access chunks of primes forward or backwards



        ListLinePlot[dataDoForward, dataDoReverse, 
        PlotRange -> Automatic, 0, 0.0005,
        PlotLegends -> "Forward Access", "Backward Access"
        ]


        Mathematica graphics



        I cannot explain, why the forward access method is a bit slower at the beginning for primes larger 1 million, but it is possible that some caches need to be cleared. This however is a topic for someone who actually knows what Wolfram is doing under the hood.







        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered 1 hour ago









        halirutan♦

        93k5212405




        93k5212405



























             

            draft saved


            draft discarded















































             


            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathematica.stackexchange.com%2fquestions%2f182427%2flist-generation-speed-change-at-800-iterations%23new-answer', 'question_page');

            );

            Post as a guest













































































            Comments

            Popular posts from this blog

            Long meetings (6-7 hours a day): Being “babysat” by supervisor

            What does second last employer means? [closed]

            One-line joke