Basic probability simulations in Python 3

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP











up vote
3
down vote

favorite
1












I have been learning Probability from Statistics 110 and trying to simulate problems that have an unintuitive answer



from numpy import cumsum
from statistics import mean
from numpy.random import exponential
from random import randint, sample, uniform
from bisect import bisect_left

def mean_of_experiments(experiment_func, N=100000):
'''Decorator to repeat any Bernoulli trial N times and return probability of success'''
def wrapper(*args, **kwargs):
return round(mean(experiment_func(*args, **kwargs) for _ in range(N)), 3)
return wrapper

@mean_of_experiments
def common_birthday(k):
'''Simulates an experiment to generate k independent uniformly random birthdays and check if there are any repeat birthdays'''
rands = [randint(1, 365) for _ in range(k)]
return len(rands) != len(set(rands))

@mean_of_experiments
def matching(k=52):
'''Simulates an experiment to permute 'k' cards and check if any jth card's value is j'''
idx_labels = enumerate(sample(range(k), k))
return any(idx == label for idx, label in idx_labels)

@mean_of_experiments
def dice(n, c):
'''Simulates an experiment to roll 'n' dice and and check if count of 6's is at least c'''
return [randint(1, 6) for _ in range(n)].count(6) >= c

def boardings(scale=5.0, N=1_00_000):
'''Simulates an experiment where arrival of buses at stop follows a Poisson process and finds avg. inter-arrival time at a random instant'''
arrivals = cumsum([exponential(scale=scale) for _ in range(N)])

@mean_of_experiments
def wait():
boarding_idx = bisect_left(arrivals, uniform(0, arrivals[-1]))
missed_bus = 0 if boarding_idx == 0 else arrivals[boarding_idx - 1]
return arrivals[boarding_idx] - missed_bus

return wait()


Probability Problems:




  1. common_birthday: Given k people, what's the probability that any 2 people share a birthday?


  2. matching: Given 52 cards with a unique label in 0...n-1, they are shuffled. What's the probability that there's any j^th card's label is j?


  3. dice: Problem posed by a gambler to Newton


  4. boardings: Given that a bus company's buses follow a Poisson process, if a person visits a stop at some random time, what was that inter-arrival time? From Tsitsiklis' lectures

What I wish I could do in my code is dynamically set the value of N in the decorator mean_of_experiments. Is that possible?










share|improve this question









New contributor




kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.















  • 1




    Welcome to Code Review! Thank you for writing a question that's both clear and complete. Could you please add what version of Python you wrote this for?
    – Mast
    2 hours ago






  • 1




    @Mast: Thanks :). Python 3, I edited the question too.
    – kamalbanga
    2 hours ago














up vote
3
down vote

favorite
1












I have been learning Probability from Statistics 110 and trying to simulate problems that have an unintuitive answer



from numpy import cumsum
from statistics import mean
from numpy.random import exponential
from random import randint, sample, uniform
from bisect import bisect_left

def mean_of_experiments(experiment_func, N=100000):
'''Decorator to repeat any Bernoulli trial N times and return probability of success'''
def wrapper(*args, **kwargs):
return round(mean(experiment_func(*args, **kwargs) for _ in range(N)), 3)
return wrapper

@mean_of_experiments
def common_birthday(k):
'''Simulates an experiment to generate k independent uniformly random birthdays and check if there are any repeat birthdays'''
rands = [randint(1, 365) for _ in range(k)]
return len(rands) != len(set(rands))

@mean_of_experiments
def matching(k=52):
'''Simulates an experiment to permute 'k' cards and check if any jth card's value is j'''
idx_labels = enumerate(sample(range(k), k))
return any(idx == label for idx, label in idx_labels)

@mean_of_experiments
def dice(n, c):
'''Simulates an experiment to roll 'n' dice and and check if count of 6's is at least c'''
return [randint(1, 6) for _ in range(n)].count(6) >= c

def boardings(scale=5.0, N=1_00_000):
'''Simulates an experiment where arrival of buses at stop follows a Poisson process and finds avg. inter-arrival time at a random instant'''
arrivals = cumsum([exponential(scale=scale) for _ in range(N)])

@mean_of_experiments
def wait():
boarding_idx = bisect_left(arrivals, uniform(0, arrivals[-1]))
missed_bus = 0 if boarding_idx == 0 else arrivals[boarding_idx - 1]
return arrivals[boarding_idx] - missed_bus

return wait()


Probability Problems:




  1. common_birthday: Given k people, what's the probability that any 2 people share a birthday?


  2. matching: Given 52 cards with a unique label in 0...n-1, they are shuffled. What's the probability that there's any j^th card's label is j?


  3. dice: Problem posed by a gambler to Newton


  4. boardings: Given that a bus company's buses follow a Poisson process, if a person visits a stop at some random time, what was that inter-arrival time? From Tsitsiklis' lectures

What I wish I could do in my code is dynamically set the value of N in the decorator mean_of_experiments. Is that possible?










share|improve this question









New contributor




kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.















  • 1




    Welcome to Code Review! Thank you for writing a question that's both clear and complete. Could you please add what version of Python you wrote this for?
    – Mast
    2 hours ago






  • 1




    @Mast: Thanks :). Python 3, I edited the question too.
    – kamalbanga
    2 hours ago












up vote
3
down vote

favorite
1









up vote
3
down vote

favorite
1






1





I have been learning Probability from Statistics 110 and trying to simulate problems that have an unintuitive answer



from numpy import cumsum
from statistics import mean
from numpy.random import exponential
from random import randint, sample, uniform
from bisect import bisect_left

def mean_of_experiments(experiment_func, N=100000):
'''Decorator to repeat any Bernoulli trial N times and return probability of success'''
def wrapper(*args, **kwargs):
return round(mean(experiment_func(*args, **kwargs) for _ in range(N)), 3)
return wrapper

@mean_of_experiments
def common_birthday(k):
'''Simulates an experiment to generate k independent uniformly random birthdays and check if there are any repeat birthdays'''
rands = [randint(1, 365) for _ in range(k)]
return len(rands) != len(set(rands))

@mean_of_experiments
def matching(k=52):
'''Simulates an experiment to permute 'k' cards and check if any jth card's value is j'''
idx_labels = enumerate(sample(range(k), k))
return any(idx == label for idx, label in idx_labels)

@mean_of_experiments
def dice(n, c):
'''Simulates an experiment to roll 'n' dice and and check if count of 6's is at least c'''
return [randint(1, 6) for _ in range(n)].count(6) >= c

def boardings(scale=5.0, N=1_00_000):
'''Simulates an experiment where arrival of buses at stop follows a Poisson process and finds avg. inter-arrival time at a random instant'''
arrivals = cumsum([exponential(scale=scale) for _ in range(N)])

@mean_of_experiments
def wait():
boarding_idx = bisect_left(arrivals, uniform(0, arrivals[-1]))
missed_bus = 0 if boarding_idx == 0 else arrivals[boarding_idx - 1]
return arrivals[boarding_idx] - missed_bus

return wait()


Probability Problems:




  1. common_birthday: Given k people, what's the probability that any 2 people share a birthday?


  2. matching: Given 52 cards with a unique label in 0...n-1, they are shuffled. What's the probability that there's any j^th card's label is j?


  3. dice: Problem posed by a gambler to Newton


  4. boardings: Given that a bus company's buses follow a Poisson process, if a person visits a stop at some random time, what was that inter-arrival time? From Tsitsiklis' lectures

What I wish I could do in my code is dynamically set the value of N in the decorator mean_of_experiments. Is that possible?










share|improve this question









New contributor




kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.











I have been learning Probability from Statistics 110 and trying to simulate problems that have an unintuitive answer



from numpy import cumsum
from statistics import mean
from numpy.random import exponential
from random import randint, sample, uniform
from bisect import bisect_left

def mean_of_experiments(experiment_func, N=100000):
'''Decorator to repeat any Bernoulli trial N times and return probability of success'''
def wrapper(*args, **kwargs):
return round(mean(experiment_func(*args, **kwargs) for _ in range(N)), 3)
return wrapper

@mean_of_experiments
def common_birthday(k):
'''Simulates an experiment to generate k independent uniformly random birthdays and check if there are any repeat birthdays'''
rands = [randint(1, 365) for _ in range(k)]
return len(rands) != len(set(rands))

@mean_of_experiments
def matching(k=52):
'''Simulates an experiment to permute 'k' cards and check if any jth card's value is j'''
idx_labels = enumerate(sample(range(k), k))
return any(idx == label for idx, label in idx_labels)

@mean_of_experiments
def dice(n, c):
'''Simulates an experiment to roll 'n' dice and and check if count of 6's is at least c'''
return [randint(1, 6) for _ in range(n)].count(6) >= c

def boardings(scale=5.0, N=1_00_000):
'''Simulates an experiment where arrival of buses at stop follows a Poisson process and finds avg. inter-arrival time at a random instant'''
arrivals = cumsum([exponential(scale=scale) for _ in range(N)])

@mean_of_experiments
def wait():
boarding_idx = bisect_left(arrivals, uniform(0, arrivals[-1]))
missed_bus = 0 if boarding_idx == 0 else arrivals[boarding_idx - 1]
return arrivals[boarding_idx] - missed_bus

return wait()


Probability Problems:




  1. common_birthday: Given k people, what's the probability that any 2 people share a birthday?


  2. matching: Given 52 cards with a unique label in 0...n-1, they are shuffled. What's the probability that there's any j^th card's label is j?


  3. dice: Problem posed by a gambler to Newton


  4. boardings: Given that a bus company's buses follow a Poisson process, if a person visits a stop at some random time, what was that inter-arrival time? From Tsitsiklis' lectures

What I wish I could do in my code is dynamically set the value of N in the decorator mean_of_experiments. Is that possible?







python python-3.x statistics simulation






share|improve this question









New contributor




kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.











share|improve this question









New contributor




kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









share|improve this question




share|improve this question








edited 2 hours ago





















New contributor




kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









asked 2 hours ago









kamalbanga

1184




1184




New contributor




kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.





New contributor





kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.







  • 1




    Welcome to Code Review! Thank you for writing a question that's both clear and complete. Could you please add what version of Python you wrote this for?
    – Mast
    2 hours ago






  • 1




    @Mast: Thanks :). Python 3, I edited the question too.
    – kamalbanga
    2 hours ago












  • 1




    Welcome to Code Review! Thank you for writing a question that's both clear and complete. Could you please add what version of Python you wrote this for?
    – Mast
    2 hours ago






  • 1




    @Mast: Thanks :). Python 3, I edited the question too.
    – kamalbanga
    2 hours ago







1




1




Welcome to Code Review! Thank you for writing a question that's both clear and complete. Could you please add what version of Python you wrote this for?
– Mast
2 hours ago




Welcome to Code Review! Thank you for writing a question that's both clear and complete. Could you please add what version of Python you wrote this for?
– Mast
2 hours ago




1




1




@Mast: Thanks :). Python 3, I edited the question too.
– kamalbanga
2 hours ago




@Mast: Thanks :). Python 3, I edited the question too.
– kamalbanga
2 hours ago










2 Answers
2






active

oldest

votes

















up vote
1
down vote



accepted










functools.wraps



Unrelated to your main question, I advice you to use functools.wraps. This way, your methods metadata get transferred to the wrapped method.
For example dice.__doc__ returns None without it, but the doctstring if you include it.



repetitions



It is possible to programatically change the number of repetitions per function. If you look at functools.lru_cache, where you need to specify the size of the cache too, you can see you just need an extra level of wrapper function:



def mean_of_experiments(N=100_000):
def inner_decorator(experiment_func):
'''Decorator to repeat any Bernoulli trial N times and return probability of success'''
@wraps(experiment_func)
def wrapper(*args, **kwargs):
# print(f"N repititions")
return round(mean(experiment_func(*args, **kwargs) for _ in range(N)), 3)
return wrapper
return inner_decorator


where the inner_decorator is the old mean_of_experiments method



Dynamically set repetitions



A decorator is nothing but syntactic sugar for decorator(func), so you can do it by not decorating the functions themselves, but doing the 'decorating' when calling them:



mean_of_experiments(N=100)(dice(100, 20))


Dynamically set repetitions 2:



Another approach is to pass the repetitions on with the kwargs to the experiment_func



def mean_of_experiments_2(experiment_func):

'''Decorator to repeat any Bernoulli trial N times and return probability of success'''
def wrapper(*args, **kwargs):
repetitions = kwargs.pop('repetitions', 100_000)
# print(f"repetitions repetitions")
return round(mean(experiment_func(*args, **kwargs) for _ in range(repetitions)), 3)
return wrapper

@mean_of_experiments_2
def dice(n, c):
'''Simulates an experiment to roll 'n' dice and and check if count of 6's is at least c'''
return [randint(1, 6) for _ in range(n)].count(6) >= c


and then call it like this: dice(6, 4, repetitions=100)



The main caveat here is not to pick repetitions as argument to any of the experiment_funcs



Further remarks



_ in ints



you wrote 100000 in 2 ways: 100000 and 1_00_000. The second way looks incorrect to me (it is syntactically correct, but I would group numbers per 3). The way I would write it is 100_000. You can pick another way, but stay consistent



common_birthday



since the rands as you make it will always contain k elements, it would be more efficient to skip this list, and immediately make the set:



def common_birthday(k):
'''Simulates an experiment to generate k independent uniformly random birthdays and check if there are any repeat birthdays'''
rands = randint(1, 365) for _ in range(k)
return len(rands) != k


dice



For large n, this can become a very long list. An alternative approach would be to either use a collections.Counter, or use sum



return sum(randint(1, 6) == 6 for _ in range(n)) >= c


or



return Counter(randint(1, 6) for _ in range(n))[6] >= c


boardings



If, instead of depending on numpy you want a native implementations, you can use itertools.accumulate and random.expovariate.



def boardings_native(scale=5.0, N=100_000):
'''Simulates an experiment where arrival of buses at stop follows a Poisson process and finds avg. inter-arrival time at a random instant'''
arrivals = list(accumulate(expovariate(lambd=1/scale) for _ in range(N)))

@mean_of_experiments(3)
def wait():
boarding_idx = bisect_left(arrivals, uniform(0, arrivals[-1]))
missed_bus = 0 if boarding_idx == 0 else arrivals[boarding_idx - 1]
return arrivals[boarding_idx] - missed_bus

return wait()


If I look at the performance, it is even faster than the numpy implementation, but that is probably because generating the arrivals is an iteration instead of a vectorised operation



%timeit boardings_native()



52 ms ± 1.42 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)



%timeit boardings()



143 ms ± 713 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)






share|improve this answer






















  • Wow! All these seem great suggestions. Regarding that decorator, what I meant is that moving beyond decorators, is there some construct in which while calling a function, say, common_birthday I could set the value for N. So if I am okay with low-confidence answer I set N=100 and if I want answer with high confidence, I could set N=1000000.
    – kamalbanga
    1 hour ago











  • I added 2 approaches to do this dynamically
    – Maarten Fabré
    53 mins ago










  • I am amazed again! I like the second approach so much! Exactly what I wished for, since it can keep a default value; in case I want to set it to some other value, I could do that too. Thanks :)
    – kamalbanga
    43 mins ago

















up vote
0
down vote













There is a function choices in random library that returns a list of elements chosen from the population with replacement. It could simplify the methods common_birthday and dice. Though, I believe, this decreases readability by a little.



from random import choices

def common_birthday(k):
return len(set(choices(range(1, 366), k=k))) != k

def dice(n, c):
return choices(range(1, 7), k=n).count(6) >= c


I removed the docstring and decorator for exposition of business logic.






share|improve this answer








New contributor




kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.

















    Your Answer




    StackExchange.ifUsing("editor", function ()
    return StackExchange.using("mathjaxEditing", function ()
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
    );
    );
    , "mathjax-editing");

    StackExchange.ifUsing("editor", function ()
    StackExchange.using("externalEditor", function ()
    StackExchange.using("snippets", function ()
    StackExchange.snippets.init();
    );
    );
    , "code-snippets");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "196"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    convertImagesToLinks: false,
    noModals: false,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );






    kamalbanga is a new contributor. Be nice, and check out our Code of Conduct.









     

    draft saved


    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f204384%2fbasic-probability-simulations-in-python-3%23new-answer', 'question_page');

    );

    Post as a guest






























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    1
    down vote



    accepted










    functools.wraps



    Unrelated to your main question, I advice you to use functools.wraps. This way, your methods metadata get transferred to the wrapped method.
    For example dice.__doc__ returns None without it, but the doctstring if you include it.



    repetitions



    It is possible to programatically change the number of repetitions per function. If you look at functools.lru_cache, where you need to specify the size of the cache too, you can see you just need an extra level of wrapper function:



    def mean_of_experiments(N=100_000):
    def inner_decorator(experiment_func):
    '''Decorator to repeat any Bernoulli trial N times and return probability of success'''
    @wraps(experiment_func)
    def wrapper(*args, **kwargs):
    # print(f"N repititions")
    return round(mean(experiment_func(*args, **kwargs) for _ in range(N)), 3)
    return wrapper
    return inner_decorator


    where the inner_decorator is the old mean_of_experiments method



    Dynamically set repetitions



    A decorator is nothing but syntactic sugar for decorator(func), so you can do it by not decorating the functions themselves, but doing the 'decorating' when calling them:



    mean_of_experiments(N=100)(dice(100, 20))


    Dynamically set repetitions 2:



    Another approach is to pass the repetitions on with the kwargs to the experiment_func



    def mean_of_experiments_2(experiment_func):

    '''Decorator to repeat any Bernoulli trial N times and return probability of success'''
    def wrapper(*args, **kwargs):
    repetitions = kwargs.pop('repetitions', 100_000)
    # print(f"repetitions repetitions")
    return round(mean(experiment_func(*args, **kwargs) for _ in range(repetitions)), 3)
    return wrapper

    @mean_of_experiments_2
    def dice(n, c):
    '''Simulates an experiment to roll 'n' dice and and check if count of 6's is at least c'''
    return [randint(1, 6) for _ in range(n)].count(6) >= c


    and then call it like this: dice(6, 4, repetitions=100)



    The main caveat here is not to pick repetitions as argument to any of the experiment_funcs



    Further remarks



    _ in ints



    you wrote 100000 in 2 ways: 100000 and 1_00_000. The second way looks incorrect to me (it is syntactically correct, but I would group numbers per 3). The way I would write it is 100_000. You can pick another way, but stay consistent



    common_birthday



    since the rands as you make it will always contain k elements, it would be more efficient to skip this list, and immediately make the set:



    def common_birthday(k):
    '''Simulates an experiment to generate k independent uniformly random birthdays and check if there are any repeat birthdays'''
    rands = randint(1, 365) for _ in range(k)
    return len(rands) != k


    dice



    For large n, this can become a very long list. An alternative approach would be to either use a collections.Counter, or use sum



    return sum(randint(1, 6) == 6 for _ in range(n)) >= c


    or



    return Counter(randint(1, 6) for _ in range(n))[6] >= c


    boardings



    If, instead of depending on numpy you want a native implementations, you can use itertools.accumulate and random.expovariate.



    def boardings_native(scale=5.0, N=100_000):
    '''Simulates an experiment where arrival of buses at stop follows a Poisson process and finds avg. inter-arrival time at a random instant'''
    arrivals = list(accumulate(expovariate(lambd=1/scale) for _ in range(N)))

    @mean_of_experiments(3)
    def wait():
    boarding_idx = bisect_left(arrivals, uniform(0, arrivals[-1]))
    missed_bus = 0 if boarding_idx == 0 else arrivals[boarding_idx - 1]
    return arrivals[boarding_idx] - missed_bus

    return wait()


    If I look at the performance, it is even faster than the numpy implementation, but that is probably because generating the arrivals is an iteration instead of a vectorised operation



    %timeit boardings_native()



    52 ms ± 1.42 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)



    %timeit boardings()



    143 ms ± 713 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)






    share|improve this answer






















    • Wow! All these seem great suggestions. Regarding that decorator, what I meant is that moving beyond decorators, is there some construct in which while calling a function, say, common_birthday I could set the value for N. So if I am okay with low-confidence answer I set N=100 and if I want answer with high confidence, I could set N=1000000.
      – kamalbanga
      1 hour ago











    • I added 2 approaches to do this dynamically
      – Maarten Fabré
      53 mins ago










    • I am amazed again! I like the second approach so much! Exactly what I wished for, since it can keep a default value; in case I want to set it to some other value, I could do that too. Thanks :)
      – kamalbanga
      43 mins ago














    up vote
    1
    down vote



    accepted










    functools.wraps



    Unrelated to your main question, I advice you to use functools.wraps. This way, your methods metadata get transferred to the wrapped method.
    For example dice.__doc__ returns None without it, but the doctstring if you include it.



    repetitions



    It is possible to programatically change the number of repetitions per function. If you look at functools.lru_cache, where you need to specify the size of the cache too, you can see you just need an extra level of wrapper function:



    def mean_of_experiments(N=100_000):
    def inner_decorator(experiment_func):
    '''Decorator to repeat any Bernoulli trial N times and return probability of success'''
    @wraps(experiment_func)
    def wrapper(*args, **kwargs):
    # print(f"N repititions")
    return round(mean(experiment_func(*args, **kwargs) for _ in range(N)), 3)
    return wrapper
    return inner_decorator


    where the inner_decorator is the old mean_of_experiments method



    Dynamically set repetitions



    A decorator is nothing but syntactic sugar for decorator(func), so you can do it by not decorating the functions themselves, but doing the 'decorating' when calling them:



    mean_of_experiments(N=100)(dice(100, 20))


    Dynamically set repetitions 2:



    Another approach is to pass the repetitions on with the kwargs to the experiment_func



    def mean_of_experiments_2(experiment_func):

    '''Decorator to repeat any Bernoulli trial N times and return probability of success'''
    def wrapper(*args, **kwargs):
    repetitions = kwargs.pop('repetitions', 100_000)
    # print(f"repetitions repetitions")
    return round(mean(experiment_func(*args, **kwargs) for _ in range(repetitions)), 3)
    return wrapper

    @mean_of_experiments_2
    def dice(n, c):
    '''Simulates an experiment to roll 'n' dice and and check if count of 6's is at least c'''
    return [randint(1, 6) for _ in range(n)].count(6) >= c


    and then call it like this: dice(6, 4, repetitions=100)



    The main caveat here is not to pick repetitions as argument to any of the experiment_funcs



    Further remarks



    _ in ints



    you wrote 100000 in 2 ways: 100000 and 1_00_000. The second way looks incorrect to me (it is syntactically correct, but I would group numbers per 3). The way I would write it is 100_000. You can pick another way, but stay consistent



    common_birthday



    since the rands as you make it will always contain k elements, it would be more efficient to skip this list, and immediately make the set:



    def common_birthday(k):
    '''Simulates an experiment to generate k independent uniformly random birthdays and check if there are any repeat birthdays'''
    rands = randint(1, 365) for _ in range(k)
    return len(rands) != k


    dice



    For large n, this can become a very long list. An alternative approach would be to either use a collections.Counter, or use sum



    return sum(randint(1, 6) == 6 for _ in range(n)) >= c


    or



    return Counter(randint(1, 6) for _ in range(n))[6] >= c


    boardings



    If, instead of depending on numpy you want a native implementations, you can use itertools.accumulate and random.expovariate.



    def boardings_native(scale=5.0, N=100_000):
    '''Simulates an experiment where arrival of buses at stop follows a Poisson process and finds avg. inter-arrival time at a random instant'''
    arrivals = list(accumulate(expovariate(lambd=1/scale) for _ in range(N)))

    @mean_of_experiments(3)
    def wait():
    boarding_idx = bisect_left(arrivals, uniform(0, arrivals[-1]))
    missed_bus = 0 if boarding_idx == 0 else arrivals[boarding_idx - 1]
    return arrivals[boarding_idx] - missed_bus

    return wait()


    If I look at the performance, it is even faster than the numpy implementation, but that is probably because generating the arrivals is an iteration instead of a vectorised operation



    %timeit boardings_native()



    52 ms ± 1.42 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)



    %timeit boardings()



    143 ms ± 713 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)






    share|improve this answer






















    • Wow! All these seem great suggestions. Regarding that decorator, what I meant is that moving beyond decorators, is there some construct in which while calling a function, say, common_birthday I could set the value for N. So if I am okay with low-confidence answer I set N=100 and if I want answer with high confidence, I could set N=1000000.
      – kamalbanga
      1 hour ago











    • I added 2 approaches to do this dynamically
      – Maarten Fabré
      53 mins ago










    • I am amazed again! I like the second approach so much! Exactly what I wished for, since it can keep a default value; in case I want to set it to some other value, I could do that too. Thanks :)
      – kamalbanga
      43 mins ago












    up vote
    1
    down vote



    accepted







    up vote
    1
    down vote



    accepted






    functools.wraps



    Unrelated to your main question, I advice you to use functools.wraps. This way, your methods metadata get transferred to the wrapped method.
    For example dice.__doc__ returns None without it, but the doctstring if you include it.



    repetitions



    It is possible to programatically change the number of repetitions per function. If you look at functools.lru_cache, where you need to specify the size of the cache too, you can see you just need an extra level of wrapper function:



    def mean_of_experiments(N=100_000):
    def inner_decorator(experiment_func):
    '''Decorator to repeat any Bernoulli trial N times and return probability of success'''
    @wraps(experiment_func)
    def wrapper(*args, **kwargs):
    # print(f"N repititions")
    return round(mean(experiment_func(*args, **kwargs) for _ in range(N)), 3)
    return wrapper
    return inner_decorator


    where the inner_decorator is the old mean_of_experiments method



    Dynamically set repetitions



    A decorator is nothing but syntactic sugar for decorator(func), so you can do it by not decorating the functions themselves, but doing the 'decorating' when calling them:



    mean_of_experiments(N=100)(dice(100, 20))


    Dynamically set repetitions 2:



    Another approach is to pass the repetitions on with the kwargs to the experiment_func



    def mean_of_experiments_2(experiment_func):

    '''Decorator to repeat any Bernoulli trial N times and return probability of success'''
    def wrapper(*args, **kwargs):
    repetitions = kwargs.pop('repetitions', 100_000)
    # print(f"repetitions repetitions")
    return round(mean(experiment_func(*args, **kwargs) for _ in range(repetitions)), 3)
    return wrapper

    @mean_of_experiments_2
    def dice(n, c):
    '''Simulates an experiment to roll 'n' dice and and check if count of 6's is at least c'''
    return [randint(1, 6) for _ in range(n)].count(6) >= c


    and then call it like this: dice(6, 4, repetitions=100)



    The main caveat here is not to pick repetitions as argument to any of the experiment_funcs



    Further remarks



    _ in ints



    you wrote 100000 in 2 ways: 100000 and 1_00_000. The second way looks incorrect to me (it is syntactically correct, but I would group numbers per 3). The way I would write it is 100_000. You can pick another way, but stay consistent



    common_birthday



    since the rands as you make it will always contain k elements, it would be more efficient to skip this list, and immediately make the set:



    def common_birthday(k):
    '''Simulates an experiment to generate k independent uniformly random birthdays and check if there are any repeat birthdays'''
    rands = randint(1, 365) for _ in range(k)
    return len(rands) != k


    dice



    For large n, this can become a very long list. An alternative approach would be to either use a collections.Counter, or use sum



    return sum(randint(1, 6) == 6 for _ in range(n)) >= c


    or



    return Counter(randint(1, 6) for _ in range(n))[6] >= c


    boardings



    If, instead of depending on numpy you want a native implementations, you can use itertools.accumulate and random.expovariate.



    def boardings_native(scale=5.0, N=100_000):
    '''Simulates an experiment where arrival of buses at stop follows a Poisson process and finds avg. inter-arrival time at a random instant'''
    arrivals = list(accumulate(expovariate(lambd=1/scale) for _ in range(N)))

    @mean_of_experiments(3)
    def wait():
    boarding_idx = bisect_left(arrivals, uniform(0, arrivals[-1]))
    missed_bus = 0 if boarding_idx == 0 else arrivals[boarding_idx - 1]
    return arrivals[boarding_idx] - missed_bus

    return wait()


    If I look at the performance, it is even faster than the numpy implementation, but that is probably because generating the arrivals is an iteration instead of a vectorised operation



    %timeit boardings_native()



    52 ms ± 1.42 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)



    %timeit boardings()



    143 ms ± 713 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)






    share|improve this answer














    functools.wraps



    Unrelated to your main question, I advice you to use functools.wraps. This way, your methods metadata get transferred to the wrapped method.
    For example dice.__doc__ returns None without it, but the doctstring if you include it.



    repetitions



    It is possible to programatically change the number of repetitions per function. If you look at functools.lru_cache, where you need to specify the size of the cache too, you can see you just need an extra level of wrapper function:



    def mean_of_experiments(N=100_000):
    def inner_decorator(experiment_func):
    '''Decorator to repeat any Bernoulli trial N times and return probability of success'''
    @wraps(experiment_func)
    def wrapper(*args, **kwargs):
    # print(f"N repititions")
    return round(mean(experiment_func(*args, **kwargs) for _ in range(N)), 3)
    return wrapper
    return inner_decorator


    where the inner_decorator is the old mean_of_experiments method



    Dynamically set repetitions



    A decorator is nothing but syntactic sugar for decorator(func), so you can do it by not decorating the functions themselves, but doing the 'decorating' when calling them:



    mean_of_experiments(N=100)(dice(100, 20))


    Dynamically set repetitions 2:



    Another approach is to pass the repetitions on with the kwargs to the experiment_func



    def mean_of_experiments_2(experiment_func):

    '''Decorator to repeat any Bernoulli trial N times and return probability of success'''
    def wrapper(*args, **kwargs):
    repetitions = kwargs.pop('repetitions', 100_000)
    # print(f"repetitions repetitions")
    return round(mean(experiment_func(*args, **kwargs) for _ in range(repetitions)), 3)
    return wrapper

    @mean_of_experiments_2
    def dice(n, c):
    '''Simulates an experiment to roll 'n' dice and and check if count of 6's is at least c'''
    return [randint(1, 6) for _ in range(n)].count(6) >= c


    and then call it like this: dice(6, 4, repetitions=100)



    The main caveat here is not to pick repetitions as argument to any of the experiment_funcs



    Further remarks



    _ in ints



    you wrote 100000 in 2 ways: 100000 and 1_00_000. The second way looks incorrect to me (it is syntactically correct, but I would group numbers per 3). The way I would write it is 100_000. You can pick another way, but stay consistent



    common_birthday



    since the rands as you make it will always contain k elements, it would be more efficient to skip this list, and immediately make the set:



    def common_birthday(k):
    '''Simulates an experiment to generate k independent uniformly random birthdays and check if there are any repeat birthdays'''
    rands = randint(1, 365) for _ in range(k)
    return len(rands) != k


    dice



    For large n, this can become a very long list. An alternative approach would be to either use a collections.Counter, or use sum



    return sum(randint(1, 6) == 6 for _ in range(n)) >= c


    or



    return Counter(randint(1, 6) for _ in range(n))[6] >= c


    boardings



    If, instead of depending on numpy you want a native implementations, you can use itertools.accumulate and random.expovariate.



    def boardings_native(scale=5.0, N=100_000):
    '''Simulates an experiment where arrival of buses at stop follows a Poisson process and finds avg. inter-arrival time at a random instant'''
    arrivals = list(accumulate(expovariate(lambd=1/scale) for _ in range(N)))

    @mean_of_experiments(3)
    def wait():
    boarding_idx = bisect_left(arrivals, uniform(0, arrivals[-1]))
    missed_bus = 0 if boarding_idx == 0 else arrivals[boarding_idx - 1]
    return arrivals[boarding_idx] - missed_bus

    return wait()


    If I look at the performance, it is even faster than the numpy implementation, but that is probably because generating the arrivals is an iteration instead of a vectorised operation



    %timeit boardings_native()



    52 ms ± 1.42 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)



    %timeit boardings()



    143 ms ± 713 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)







    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited 53 mins ago

























    answered 1 hour ago









    Maarten Fabré

    3,734215




    3,734215











    • Wow! All these seem great suggestions. Regarding that decorator, what I meant is that moving beyond decorators, is there some construct in which while calling a function, say, common_birthday I could set the value for N. So if I am okay with low-confidence answer I set N=100 and if I want answer with high confidence, I could set N=1000000.
      – kamalbanga
      1 hour ago











    • I added 2 approaches to do this dynamically
      – Maarten Fabré
      53 mins ago










    • I am amazed again! I like the second approach so much! Exactly what I wished for, since it can keep a default value; in case I want to set it to some other value, I could do that too. Thanks :)
      – kamalbanga
      43 mins ago
















    • Wow! All these seem great suggestions. Regarding that decorator, what I meant is that moving beyond decorators, is there some construct in which while calling a function, say, common_birthday I could set the value for N. So if I am okay with low-confidence answer I set N=100 and if I want answer with high confidence, I could set N=1000000.
      – kamalbanga
      1 hour ago











    • I added 2 approaches to do this dynamically
      – Maarten Fabré
      53 mins ago










    • I am amazed again! I like the second approach so much! Exactly what I wished for, since it can keep a default value; in case I want to set it to some other value, I could do that too. Thanks :)
      – kamalbanga
      43 mins ago















    Wow! All these seem great suggestions. Regarding that decorator, what I meant is that moving beyond decorators, is there some construct in which while calling a function, say, common_birthday I could set the value for N. So if I am okay with low-confidence answer I set N=100 and if I want answer with high confidence, I could set N=1000000.
    – kamalbanga
    1 hour ago





    Wow! All these seem great suggestions. Regarding that decorator, what I meant is that moving beyond decorators, is there some construct in which while calling a function, say, common_birthday I could set the value for N. So if I am okay with low-confidence answer I set N=100 and if I want answer with high confidence, I could set N=1000000.
    – kamalbanga
    1 hour ago













    I added 2 approaches to do this dynamically
    – Maarten Fabré
    53 mins ago




    I added 2 approaches to do this dynamically
    – Maarten Fabré
    53 mins ago












    I am amazed again! I like the second approach so much! Exactly what I wished for, since it can keep a default value; in case I want to set it to some other value, I could do that too. Thanks :)
    – kamalbanga
    43 mins ago




    I am amazed again! I like the second approach so much! Exactly what I wished for, since it can keep a default value; in case I want to set it to some other value, I could do that too. Thanks :)
    – kamalbanga
    43 mins ago












    up vote
    0
    down vote













    There is a function choices in random library that returns a list of elements chosen from the population with replacement. It could simplify the methods common_birthday and dice. Though, I believe, this decreases readability by a little.



    from random import choices

    def common_birthday(k):
    return len(set(choices(range(1, 366), k=k))) != k

    def dice(n, c):
    return choices(range(1, 7), k=n).count(6) >= c


    I removed the docstring and decorator for exposition of business logic.






    share|improve this answer








    New contributor




    kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
    Check out our Code of Conduct.





















      up vote
      0
      down vote













      There is a function choices in random library that returns a list of elements chosen from the population with replacement. It could simplify the methods common_birthday and dice. Though, I believe, this decreases readability by a little.



      from random import choices

      def common_birthday(k):
      return len(set(choices(range(1, 366), k=k))) != k

      def dice(n, c):
      return choices(range(1, 7), k=n).count(6) >= c


      I removed the docstring and decorator for exposition of business logic.






      share|improve this answer








      New contributor




      kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.



















        up vote
        0
        down vote










        up vote
        0
        down vote









        There is a function choices in random library that returns a list of elements chosen from the population with replacement. It could simplify the methods common_birthday and dice. Though, I believe, this decreases readability by a little.



        from random import choices

        def common_birthday(k):
        return len(set(choices(range(1, 366), k=k))) != k

        def dice(n, c):
        return choices(range(1, 7), k=n).count(6) >= c


        I removed the docstring and decorator for exposition of business logic.






        share|improve this answer








        New contributor




        kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.









        There is a function choices in random library that returns a list of elements chosen from the population with replacement. It could simplify the methods common_birthday and dice. Though, I believe, this decreases readability by a little.



        from random import choices

        def common_birthday(k):
        return len(set(choices(range(1, 366), k=k))) != k

        def dice(n, c):
        return choices(range(1, 7), k=n).count(6) >= c


        I removed the docstring and decorator for exposition of business logic.







        share|improve this answer








        New contributor




        kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.









        share|improve this answer



        share|improve this answer






        New contributor




        kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.









        answered 32 mins ago









        kamalbanga

        1184




        1184




        New contributor




        kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.





        New contributor





        kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.






        kamalbanga is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.




















            kamalbanga is a new contributor. Be nice, and check out our Code of Conduct.









             

            draft saved


            draft discarded


















            kamalbanga is a new contributor. Be nice, and check out our Code of Conduct.












            kamalbanga is a new contributor. Be nice, and check out our Code of Conduct.











            kamalbanga is a new contributor. Be nice, and check out our Code of Conduct.













             


            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f204384%2fbasic-probability-simulations-in-python-3%23new-answer', 'question_page');

            );

            Post as a guest













































































            Comments

            Popular posts from this blog

            What does second last employer means? [closed]

            List of Gilmore Girls characters

            Confectionery