Why a simulation of a probability experiment is off by a factor of 10?

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP











up vote
4
down vote

favorite
2












From a university homework assignment:
There are $8$ numbered cells and $12$ indistinct balls. All $12$ balls are randomly divided between all of the $8$ cells. What is the probability that there is not a single empty cell ($i.e.$ each cell has at least $1$ ball)?



The answer is $largefracbinom117binom197$ which is about $0.0065$. I reached this result independently and it was confirmed by the official homework solution of the university.



A friend of mine and I independently wrote python simulations that run the experiment many times (tested up to $1,000,000$). We used both pythons' random generator and several randomly generated lists from www.random.org. Results were similar and consistently hovering around $0.09$ which is a factor of $10$ or even a bit more off from the expected theoretical result.



Have we made some wrong assumptions?
Any ideas for this discrepancy?
Thanks.



P.S
I will paste the python code that I wrote, maybe there is some faulty logic there.










share|cite|improve this question























  • I suppose the issue must be in "randomly divided." What does it mean exactly? What is the distribution of probabilities between different outcomes? Was the same distribution simulated in the experiment and used in the formula?
    – Alexey
    14 mins ago










  • If you suppose that all possible distributions of the numbers of balls (not of balls themselves) between cells have the same probability, I wonder how you could simulate this in an experiment...
    – Alexey
    12 mins ago















up vote
4
down vote

favorite
2












From a university homework assignment:
There are $8$ numbered cells and $12$ indistinct balls. All $12$ balls are randomly divided between all of the $8$ cells. What is the probability that there is not a single empty cell ($i.e.$ each cell has at least $1$ ball)?



The answer is $largefracbinom117binom197$ which is about $0.0065$. I reached this result independently and it was confirmed by the official homework solution of the university.



A friend of mine and I independently wrote python simulations that run the experiment many times (tested up to $1,000,000$). We used both pythons' random generator and several randomly generated lists from www.random.org. Results were similar and consistently hovering around $0.09$ which is a factor of $10$ or even a bit more off from the expected theoretical result.



Have we made some wrong assumptions?
Any ideas for this discrepancy?
Thanks.



P.S
I will paste the python code that I wrote, maybe there is some faulty logic there.










share|cite|improve this question























  • I suppose the issue must be in "randomly divided." What does it mean exactly? What is the distribution of probabilities between different outcomes? Was the same distribution simulated in the experiment and used in the formula?
    – Alexey
    14 mins ago










  • If you suppose that all possible distributions of the numbers of balls (not of balls themselves) between cells have the same probability, I wonder how you could simulate this in an experiment...
    – Alexey
    12 mins ago













up vote
4
down vote

favorite
2









up vote
4
down vote

favorite
2






2





From a university homework assignment:
There are $8$ numbered cells and $12$ indistinct balls. All $12$ balls are randomly divided between all of the $8$ cells. What is the probability that there is not a single empty cell ($i.e.$ each cell has at least $1$ ball)?



The answer is $largefracbinom117binom197$ which is about $0.0065$. I reached this result independently and it was confirmed by the official homework solution of the university.



A friend of mine and I independently wrote python simulations that run the experiment many times (tested up to $1,000,000$). We used both pythons' random generator and several randomly generated lists from www.random.org. Results were similar and consistently hovering around $0.09$ which is a factor of $10$ or even a bit more off from the expected theoretical result.



Have we made some wrong assumptions?
Any ideas for this discrepancy?
Thanks.



P.S
I will paste the python code that I wrote, maybe there is some faulty logic there.










share|cite|improve this question















From a university homework assignment:
There are $8$ numbered cells and $12$ indistinct balls. All $12$ balls are randomly divided between all of the $8$ cells. What is the probability that there is not a single empty cell ($i.e.$ each cell has at least $1$ ball)?



The answer is $largefracbinom117binom197$ which is about $0.0065$. I reached this result independently and it was confirmed by the official homework solution of the university.



A friend of mine and I independently wrote python simulations that run the experiment many times (tested up to $1,000,000$). We used both pythons' random generator and several randomly generated lists from www.random.org. Results were similar and consistently hovering around $0.09$ which is a factor of $10$ or even a bit more off from the expected theoretical result.



Have we made some wrong assumptions?
Any ideas for this discrepancy?
Thanks.



P.S
I will paste the python code that I wrote, maybe there is some faulty logic there.







probability simulation python






share|cite|improve this question















share|cite|improve this question













share|cite|improve this question




share|cite|improve this question








edited 23 mins ago

























asked 1 hour ago









Shmuel Levinson

213




213











  • I suppose the issue must be in "randomly divided." What does it mean exactly? What is the distribution of probabilities between different outcomes? Was the same distribution simulated in the experiment and used in the formula?
    – Alexey
    14 mins ago










  • If you suppose that all possible distributions of the numbers of balls (not of balls themselves) between cells have the same probability, I wonder how you could simulate this in an experiment...
    – Alexey
    12 mins ago

















  • I suppose the issue must be in "randomly divided." What does it mean exactly? What is the distribution of probabilities between different outcomes? Was the same distribution simulated in the experiment and used in the formula?
    – Alexey
    14 mins ago










  • If you suppose that all possible distributions of the numbers of balls (not of balls themselves) between cells have the same probability, I wonder how you could simulate this in an experiment...
    – Alexey
    12 mins ago
















I suppose the issue must be in "randomly divided." What does it mean exactly? What is the distribution of probabilities between different outcomes? Was the same distribution simulated in the experiment and used in the formula?
– Alexey
14 mins ago




I suppose the issue must be in "randomly divided." What does it mean exactly? What is the distribution of probabilities between different outcomes? Was the same distribution simulated in the experiment and used in the formula?
– Alexey
14 mins ago












If you suppose that all possible distributions of the numbers of balls (not of balls themselves) between cells have the same probability, I wonder how you could simulate this in an experiment...
– Alexey
12 mins ago





If you suppose that all possible distributions of the numbers of balls (not of balls themselves) between cells have the same probability, I wonder how you could simulate this in an experiment...
– Alexey
12 mins ago











2 Answers
2






active

oldest

votes

















up vote
4
down vote













In reality, you will find it very difficult to put the balls in the cells without distinguishing between the balls, especially if you want equal probabilities so as to use counting methods for simulation. Suppose you wanted to consider the probability all the balls went into the first cell: with distinguishable balls this probability is $frac18^12$ and is easily simulated though a rare occurrence; with indistinguishable balls it is $frac119 choose 7$ over a million times more likely but difficult to simulate



If the balls are distinguishable, the probability all eight boxes are full is $$frac8! , S_2(12,8)8^12$$ where $S_2(n,k)$ is a Stirling number of the second kind and $S_2(12,8)=159027$. That gives a probability that each cell has at least one ball of about $0.0933$. Is this similar to your simulation?



If you really want to simulate the indistinguishable balls case, despite it not being realistic physically outside Bose–Einstein condensate at temperatures close to absolute zero, you could use a stars and bars analogy. Choose $7$ distinct positions for the cells walls from possible positions $0,1,2,3,ldots,18$ for the balls and cell walls; a success is when none of the cell walls are at positions $0$ or $18$ and no pair of them are consecutive






share|cite|improve this answer






















  • This is indeed similar to my simulation. Thanks for the detailed answer!
    – Shmuel Levinson
    18 mins ago

















up vote
1
down vote













Consider the set $D$ of ways to distribute $12$ balls labelled [abcdefghijkl] among $8$ cells numbered [01234567]. This set has $8^12approx7times10^10$ elements.



Now consider the set $I$ of distinguishable ways to populate those same $8$ cells [01234567] with $12$ indistinct balls. This set has $19choose7approx 5times10^4$ elements.



The assignment asks you to compute a probability of an event over the uniform distribution on $I$, if not in so many words. In principle, you could approximate this probability by sampling from the uniform distribution on $I$. But your strategy is to sample from the uniform distribution on $D$, and then map each sample to $I$! That's not the same.



Instead of taking the average of all the results, you need to take a weighted average, such that the weight compensates for the number of elements in $D$ that map to the same element of $I$. Hint, it's something like this:



weight = 1
for cell_population in cells:
weight *= math.factorial(cell_population)


At least, that gets the right answer. Rigorously justifying that formula as a consequence of the mapping between $D$ and $I$ is left as an exercise to the reader.






share|cite|improve this answer




















    Your Answer




    StackExchange.ifUsing("editor", function ()
    return StackExchange.using("mathjaxEditing", function ()
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
    );
    );
    , "mathjax-editing");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "69"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    convertImagesToLinks: true,
    noModals: false,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    noCode: true, onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );













     

    draft saved


    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2975845%2fwhy-a-simulation-of-a-probability-experiment-is-off-by-a-factor-of-10%23new-answer', 'question_page');

    );

    Post as a guest






























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    4
    down vote













    In reality, you will find it very difficult to put the balls in the cells without distinguishing between the balls, especially if you want equal probabilities so as to use counting methods for simulation. Suppose you wanted to consider the probability all the balls went into the first cell: with distinguishable balls this probability is $frac18^12$ and is easily simulated though a rare occurrence; with indistinguishable balls it is $frac119 choose 7$ over a million times more likely but difficult to simulate



    If the balls are distinguishable, the probability all eight boxes are full is $$frac8! , S_2(12,8)8^12$$ where $S_2(n,k)$ is a Stirling number of the second kind and $S_2(12,8)=159027$. That gives a probability that each cell has at least one ball of about $0.0933$. Is this similar to your simulation?



    If you really want to simulate the indistinguishable balls case, despite it not being realistic physically outside Bose–Einstein condensate at temperatures close to absolute zero, you could use a stars and bars analogy. Choose $7$ distinct positions for the cells walls from possible positions $0,1,2,3,ldots,18$ for the balls and cell walls; a success is when none of the cell walls are at positions $0$ or $18$ and no pair of them are consecutive






    share|cite|improve this answer






















    • This is indeed similar to my simulation. Thanks for the detailed answer!
      – Shmuel Levinson
      18 mins ago














    up vote
    4
    down vote













    In reality, you will find it very difficult to put the balls in the cells without distinguishing between the balls, especially if you want equal probabilities so as to use counting methods for simulation. Suppose you wanted to consider the probability all the balls went into the first cell: with distinguishable balls this probability is $frac18^12$ and is easily simulated though a rare occurrence; with indistinguishable balls it is $frac119 choose 7$ over a million times more likely but difficult to simulate



    If the balls are distinguishable, the probability all eight boxes are full is $$frac8! , S_2(12,8)8^12$$ where $S_2(n,k)$ is a Stirling number of the second kind and $S_2(12,8)=159027$. That gives a probability that each cell has at least one ball of about $0.0933$. Is this similar to your simulation?



    If you really want to simulate the indistinguishable balls case, despite it not being realistic physically outside Bose–Einstein condensate at temperatures close to absolute zero, you could use a stars and bars analogy. Choose $7$ distinct positions for the cells walls from possible positions $0,1,2,3,ldots,18$ for the balls and cell walls; a success is when none of the cell walls are at positions $0$ or $18$ and no pair of them are consecutive






    share|cite|improve this answer






















    • This is indeed similar to my simulation. Thanks for the detailed answer!
      – Shmuel Levinson
      18 mins ago












    up vote
    4
    down vote










    up vote
    4
    down vote









    In reality, you will find it very difficult to put the balls in the cells without distinguishing between the balls, especially if you want equal probabilities so as to use counting methods for simulation. Suppose you wanted to consider the probability all the balls went into the first cell: with distinguishable balls this probability is $frac18^12$ and is easily simulated though a rare occurrence; with indistinguishable balls it is $frac119 choose 7$ over a million times more likely but difficult to simulate



    If the balls are distinguishable, the probability all eight boxes are full is $$frac8! , S_2(12,8)8^12$$ where $S_2(n,k)$ is a Stirling number of the second kind and $S_2(12,8)=159027$. That gives a probability that each cell has at least one ball of about $0.0933$. Is this similar to your simulation?



    If you really want to simulate the indistinguishable balls case, despite it not being realistic physically outside Bose–Einstein condensate at temperatures close to absolute zero, you could use a stars and bars analogy. Choose $7$ distinct positions for the cells walls from possible positions $0,1,2,3,ldots,18$ for the balls and cell walls; a success is when none of the cell walls are at positions $0$ or $18$ and no pair of them are consecutive






    share|cite|improve this answer














    In reality, you will find it very difficult to put the balls in the cells without distinguishing between the balls, especially if you want equal probabilities so as to use counting methods for simulation. Suppose you wanted to consider the probability all the balls went into the first cell: with distinguishable balls this probability is $frac18^12$ and is easily simulated though a rare occurrence; with indistinguishable balls it is $frac119 choose 7$ over a million times more likely but difficult to simulate



    If the balls are distinguishable, the probability all eight boxes are full is $$frac8! , S_2(12,8)8^12$$ where $S_2(n,k)$ is a Stirling number of the second kind and $S_2(12,8)=159027$. That gives a probability that each cell has at least one ball of about $0.0933$. Is this similar to your simulation?



    If you really want to simulate the indistinguishable balls case, despite it not being realistic physically outside Bose–Einstein condensate at temperatures close to absolute zero, you could use a stars and bars analogy. Choose $7$ distinct positions for the cells walls from possible positions $0,1,2,3,ldots,18$ for the balls and cell walls; a success is when none of the cell walls are at positions $0$ or $18$ and no pair of them are consecutive







    share|cite|improve this answer














    share|cite|improve this answer



    share|cite|improve this answer








    edited 35 mins ago

























    answered 49 mins ago









    Henry

    95.6k473152




    95.6k473152











    • This is indeed similar to my simulation. Thanks for the detailed answer!
      – Shmuel Levinson
      18 mins ago
















    • This is indeed similar to my simulation. Thanks for the detailed answer!
      – Shmuel Levinson
      18 mins ago















    This is indeed similar to my simulation. Thanks for the detailed answer!
    – Shmuel Levinson
    18 mins ago




    This is indeed similar to my simulation. Thanks for the detailed answer!
    – Shmuel Levinson
    18 mins ago










    up vote
    1
    down vote













    Consider the set $D$ of ways to distribute $12$ balls labelled [abcdefghijkl] among $8$ cells numbered [01234567]. This set has $8^12approx7times10^10$ elements.



    Now consider the set $I$ of distinguishable ways to populate those same $8$ cells [01234567] with $12$ indistinct balls. This set has $19choose7approx 5times10^4$ elements.



    The assignment asks you to compute a probability of an event over the uniform distribution on $I$, if not in so many words. In principle, you could approximate this probability by sampling from the uniform distribution on $I$. But your strategy is to sample from the uniform distribution on $D$, and then map each sample to $I$! That's not the same.



    Instead of taking the average of all the results, you need to take a weighted average, such that the weight compensates for the number of elements in $D$ that map to the same element of $I$. Hint, it's something like this:



    weight = 1
    for cell_population in cells:
    weight *= math.factorial(cell_population)


    At least, that gets the right answer. Rigorously justifying that formula as a consequence of the mapping between $D$ and $I$ is left as an exercise to the reader.






    share|cite|improve this answer
























      up vote
      1
      down vote













      Consider the set $D$ of ways to distribute $12$ balls labelled [abcdefghijkl] among $8$ cells numbered [01234567]. This set has $8^12approx7times10^10$ elements.



      Now consider the set $I$ of distinguishable ways to populate those same $8$ cells [01234567] with $12$ indistinct balls. This set has $19choose7approx 5times10^4$ elements.



      The assignment asks you to compute a probability of an event over the uniform distribution on $I$, if not in so many words. In principle, you could approximate this probability by sampling from the uniform distribution on $I$. But your strategy is to sample from the uniform distribution on $D$, and then map each sample to $I$! That's not the same.



      Instead of taking the average of all the results, you need to take a weighted average, such that the weight compensates for the number of elements in $D$ that map to the same element of $I$. Hint, it's something like this:



      weight = 1
      for cell_population in cells:
      weight *= math.factorial(cell_population)


      At least, that gets the right answer. Rigorously justifying that formula as a consequence of the mapping between $D$ and $I$ is left as an exercise to the reader.






      share|cite|improve this answer






















        up vote
        1
        down vote










        up vote
        1
        down vote









        Consider the set $D$ of ways to distribute $12$ balls labelled [abcdefghijkl] among $8$ cells numbered [01234567]. This set has $8^12approx7times10^10$ elements.



        Now consider the set $I$ of distinguishable ways to populate those same $8$ cells [01234567] with $12$ indistinct balls. This set has $19choose7approx 5times10^4$ elements.



        The assignment asks you to compute a probability of an event over the uniform distribution on $I$, if not in so many words. In principle, you could approximate this probability by sampling from the uniform distribution on $I$. But your strategy is to sample from the uniform distribution on $D$, and then map each sample to $I$! That's not the same.



        Instead of taking the average of all the results, you need to take a weighted average, such that the weight compensates for the number of elements in $D$ that map to the same element of $I$. Hint, it's something like this:



        weight = 1
        for cell_population in cells:
        weight *= math.factorial(cell_population)


        At least, that gets the right answer. Rigorously justifying that formula as a consequence of the mapping between $D$ and $I$ is left as an exercise to the reader.






        share|cite|improve this answer












        Consider the set $D$ of ways to distribute $12$ balls labelled [abcdefghijkl] among $8$ cells numbered [01234567]. This set has $8^12approx7times10^10$ elements.



        Now consider the set $I$ of distinguishable ways to populate those same $8$ cells [01234567] with $12$ indistinct balls. This set has $19choose7approx 5times10^4$ elements.



        The assignment asks you to compute a probability of an event over the uniform distribution on $I$, if not in so many words. In principle, you could approximate this probability by sampling from the uniform distribution on $I$. But your strategy is to sample from the uniform distribution on $D$, and then map each sample to $I$! That's not the same.



        Instead of taking the average of all the results, you need to take a weighted average, such that the weight compensates for the number of elements in $D$ that map to the same element of $I$. Hint, it's something like this:



        weight = 1
        for cell_population in cells:
        weight *= math.factorial(cell_population)


        At least, that gets the right answer. Rigorously justifying that formula as a consequence of the mapping between $D$ and $I$ is left as an exercise to the reader.







        share|cite|improve this answer












        share|cite|improve this answer



        share|cite|improve this answer










        answered 31 mins ago









        Chris Culter

        19.3k43280




        19.3k43280



























             

            draft saved


            draft discarded















































             


            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2975845%2fwhy-a-simulation-of-a-probability-experiment-is-off-by-a-factor-of-10%23new-answer', 'question_page');

            );

            Post as a guest













































































            Comments

            Popular posts from this blog

            What does second last employer means? [closed]

            List of Gilmore Girls characters

            Confectionery