15-110 SUMMER SESSION TWO - 2014
Lab 9 - Thursday, July 24
Goals
For this lab you will work with Monte Carlo methods to estimate the answers to questions about a simple problem involving baking cookies. When you are done, you should be able to do the following:
- Use randrange() to randomly generate integers in a given range
- Distinguish between the simulation and estimation parts of a Monte Carlo method
- Use an error range to control the estimate of an expected value by repeated trials
Deliverables
- raisins.py
Place this file in a lab9 folder. Before leaving lab,
zip up the lab9 folder
and hand the zip file in.
The Monte Carlo Method
The Monte Carlo method is a computational technique that works
by calculating a statistical summary of a large number of random
operations. In this lab, we will use the Monte Carlo method to
estimate the outcome of a stochastic process.
Suppose you want to make a batch of 25 raisin cookies. How many
raisins should you use to make sure nearly every cookie has at least 1
raisin? Assume that raisins are randomly distributed in the batter, so
each raisin is equally likely to go into each cookie.
- We can simulate the process of baking M cookies with N raisins.
- Running the simulation once just gives us a True/False (Boolean)
result. Either every cookie after the simulation has a raisin, or not.
- We run the simulation many times to estimate the probability
that every cookie will have at least one raisin.
CA-Led Activities/Demonstration
- Review Monte Carlo methods
- Describe the problem:
Batter for M cookies has N raisins. What is the
probability that all cookies have at least one raisin?
- In raisins.py, write a Python
function make_batch(num_cookies, num_raisins) to simulate
one batch of
cookies.
- Initialize list. How big? What value?
- Distribute raisins randomly (every cookie has an equal
chance of getting each raisin).
- Search the list for a cookie with no raisins and return
False if found.
- Return True if every cookie has a raisin.
Activities
All of your work should go into the file
raisins.py.
PART I. Make a function named make_batch(num_cookies,
num_raisins) that returns True if and
only if
every one of num_cookies cookies has at least 1 raisin. Implement the
function as follows:
- Create a list of num_cookies zeros. This list represents
a set of counters that keep track of the number of raisins in each cookie.
- Distribute the raisins randomly (do this num_raisins times):
- Compute a random cookie index.
- Increment cookies[index] by 1.
- Does every cookie have a raisin?
- Use a linear search on cookies
to see if any element is zero; if one is found, return False.
- If the search completes without finding a 0, return True.
PART II.
Make a function monte_carlo(num_cookies, num_raisins) to estimate the
probability that every cookie will have at least one raisin:
- Set count = 0
- Do this 1000 times:
- Call make_batch(num_cookies, num_raisins)
- If the result is True, increment count by
one.
- After the loop, return the fraction of times every cookie
has at least one raisin: count / 1000.
Call
monte_carlo(25, 100) to estimate the probability that
every cookie in a batch of 25 cookies
will have at least 1 raisin if there are 100 raisins in
all.
Call monte_carlo(25, 100) again. Make sure you
understand why you do not get the same answer.
PART III.
Now we want to answer a different question: how many raisins do
we have to use to get a good probability that every cookie has a
raisin? Write a function raisins_needed(num_cookies, error).
The idea is to start with the same number of raisins as cookies (we
need at least that many), and call monte_carlo to get the
probability that every cookie has a raisin. Keep increasing the
number of raisins until the answer from monte_carlo is
close enough to 1 (certainty). The uncertainty to allow is given
by error. For example, an error of 0.1 means
that we want an answer with certainty of 99%.
The algorithm:
- Set num_raisins = num_cookies
- Set diff to 1 - monte_carlo(num_cookies, num_raisins)
- While diff is greater than error,
increase num_raisins by one and
recalculate diff as above
- Once diff is less than or equal to error,
return num_raisins
Example (note that for large numbers of cookies and/or small error
values this will take a while to run):
>>> raisins_needed(1, .1)
1
>>> raisins_needed(2, .1)
5
>>> raisins_needed(10, .1)
45
Acknowledgement: The raisin cookie problem was suggested by Greg
Kochanski.