15-110 SUMMER SESSION TWO - 2014

Lab 9 - Thursday, July 24

Goals

For this lab you will work with Monte Carlo methods to estimate the answers to questions about a simple problem involving baking cookies. When you are done, you should be able to do the following:

Use randrange() to randomly generate integers in a given range
Distinguish between the simulation and estimation parts of a Monte Carlo method
Use an error range to control the estimate of an expected value by repeated trials
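
As a quick warm-up for the first goal, the sketch below shows how randrange() from Python's standard random module behaves. The variable names (indices, die) are made up for illustration and are not part of the lab.

```python
import random

# randrange(n) returns a random integer from 0 up to, but not including, n.
# That makes it a natural way to pick a random index into a list of length n.
indices = [random.randrange(25) for _ in range(10)]
print(indices)  # ten values, each between 0 and 24; they change on every run

# randrange(start, stop) works like range(start, stop): stop is excluded.
die = random.randrange(1, 7)
print(die)  # a value between 1 and 6
```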

Deliverables

raisins.py

Place this file in a lab9 folder. Before leaving lab, zip up the lab9 folder and hand the zip file in.

The Monte Carlo Method

The Monte Carlo method is a computational technique that works by calculating a statistical summary of a large number of random operations. In this lab, we will use the Monte Carlo method to estimate the outcome of a stochastic process.

Suppose you want to make a batch of 25 raisin cookies. How many raisins should you use to be nearly certain that every cookie has at least 1 raisin? Assume that raisins are randomly distributed in the batter, so each raisin is equally likely to go into each cookie.

We can simulate the process of baking M cookies with N raisins.
Running the simulation once gives only a True/False (Boolean) result: either every cookie ends up with at least one raisin, or at least one cookie has none.
We run the simulation many times to estimate the probability that every cookie will have at least one raisin.

CA-Led Activities/Demonstration

Review Monte Carlo methods
Describe the problem: Batter for M cookies has N raisins. What is the probability that all cookies have at least one raisin?
In raisins.py, write a Python function make_batch(num_cookies, num_raisins) to simulate one batch of cookies.
- Initialize list. How big? What value?
- Distribute raisins randomly (every cookie has an equal chance of getting each raisin).
- Search the list for a cookie with no raisins and return False if found.
- Return True if every cookie has a raisin.

Activities

All of your work should go into the file raisins.py.

PART I. Make a function named make_batch(num_cookies, num_raisins) that returns True if and only if every one of num_cookies cookies has at least 1 raisin. Implement the function as follows:

Create a list of num_cookies zeros. This list represents a set of counters that keep track of the number of raisins in each cookie.
Distribute the raisins randomly (do this num_raisins times):
- Compute a random cookie index.
- Increment cookies[index] by 1.
Does every cookie have a raisin?
- Use a linear search on cookies to see if any element is zero; if one is found, return False.
- If the search completes without finding a 0, return True.
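
The steps above could be sketched roughly as follows. This is one possible reading of the outline, not the only acceptable solution; the loop variable names are arbitrary.

```python
import random

def make_batch(num_cookies, num_raisins):
    """Simulate one batch; return True only if every cookie got a raisin."""
    # One counter per cookie, all starting at zero raisins.
    cookies = [0] * num_cookies

    # Drop each raisin into a randomly chosen cookie.
    for _ in range(num_raisins):
        index = random.randrange(num_cookies)
        cookies[index] += 1

    # Linear search: any cookie still at zero means the batch fails.
    for raisin_count in cookies:
        if raisin_count == 0:
            return False
    return True
```

A single call such as make_batch(25, 100) returns True on some runs and False on others, which is why one call alone tells you little.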

PART II. Make a function monte_carlo(num_cookies, num_raisins) to estimate the probability that every cookie will have at least one raisin:

Set count = 0
Do this 1000 times:
- Call make_batch(num_cookies, num_raisins)
- If the result is True, increment count by one.
After the loop, return the fraction of times every cookie has at least one raisin: count / 1000.
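
A minimal sketch of this estimation step is shown below. It assumes Python 3, where count / 1000 is a floating-point division, and it repeats a compact make_batch so the example runs on its own. The constant NUM_TRIALS is a name introduced here for readability.

```python
import random

NUM_TRIALS = 1000  # number of simulated batches, as given in the lab

def make_batch(num_cookies, num_raisins):
    """Simulation part: one random batch, True if no cookie is empty."""
    cookies = [0] * num_cookies
    for _ in range(num_raisins):
        cookies[random.randrange(num_cookies)] += 1
    for raisin_count in cookies:
        if raisin_count == 0:
            return False
    return True

def monte_carlo(num_cookies, num_raisins):
    """Estimation part: fraction of batches in which every cookie has a raisin."""
    count = 0
    for _ in range(NUM_TRIALS):
        if make_batch(num_cookies, num_raisins):
            count += 1
    return count / NUM_TRIALS

print(monte_carlo(25, 100))  # an estimate between 0 and 1; varies per run
```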

Call monte_carlo(25, 100) to estimate the probability that every cookie in a batch of 25 cookies will have at least 1 raisin if there are 100 raisins in all.

Call monte_carlo(25, 100) again. Make sure you understand why you do not get the same answer.
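
One way to convince yourself that the difference comes only from the random numbers is to fix the generator's seed before each call. The sketch below is an optional experiment, not part of the deliverable. The seed value 110 is arbitrary, and the helper functions are compact stand-ins for your own.

```python
import random

def make_batch(num_cookies, num_raisins):
    cookies = [0] * num_cookies
    for _ in range(num_raisins):
        cookies[random.randrange(num_cookies)] += 1
    return 0 not in cookies  # shorthand for the linear search

def monte_carlo(num_cookies, num_raisins):
    count = 0
    for _ in range(1000):
        if make_batch(num_cookies, num_raisins):
            count += 1
    return count / 1000

random.seed(110)                 # arbitrary seed: fixes the random sequence
first = monte_carlo(25, 100)
random.seed(110)                 # same seed again, so the same sequence
repeat = monte_carlo(25, 100)
unseeded = monte_carlo(25, 100)  # continues the sequence, so it usually differs

print(first, repeat, unseeded)
print(first == repeat)           # True: identical random draws, identical estimate
```

Without the calls to random.seed, each run draws different random numbers, so each estimate lands near the true probability but not exactly on it.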

PART III. Now we want to answer a different question: how many raisins do we have to use to get a good probability that every cookie has a raisin? Write a function raisins_needed(num_cookies, error). The idea is to start with the same number of raisins as cookies (we need at least that many), and call monte_carlo to get the probability that every cookie has a raisin. Keep increasing the number of raisins until the answer from monte_carlo is close enough to 1 (certainty). The uncertainty to allow is given by error. For example, an error of 0.1 means that we want the estimated probability to be at least 0.9, that is, a certainty of 90%.

The algorithm:

Set num_raisins = num_cookies
Set diff to 1 - monte_carlo(num_cookies, num_raisins)
While diff is greater than error, increase num_raisins by one and recalculate diff as above
Once diff is less than or equal to error, return num_raisins
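
The algorithm above might look like the following sketch. It again assumes Python 3 and includes compact versions of make_batch and monte_carlo only so that it runs on its own; in raisins.py you would call the functions you wrote in Parts I and II.

```python
import random

def make_batch(num_cookies, num_raisins):
    cookies = [0] * num_cookies
    for _ in range(num_raisins):
        cookies[random.randrange(num_cookies)] += 1
    return 0 not in cookies  # shorthand for the linear search

def monte_carlo(num_cookies, num_raisins):
    count = 0
    for _ in range(1000):
        if make_batch(num_cookies, num_raisins):
            count += 1
    return count / 1000

def raisins_needed(num_cookies, error):
    """Smallest raisin count tried whose estimate is within error of 1."""
    num_raisins = num_cookies  # fewer raisins than cookies can never work
    diff = 1 - monte_carlo(num_cookies, num_raisins)
    while diff > error:
        num_raisins += 1
        diff = 1 - monte_carlo(num_cookies, num_raisins)
    return num_raisins

print(raisins_needed(2, 0.1))  # result can vary slightly between runs
```

Note that the loop stops the first time an estimate is good enough, so a lucky estimate can end the search a raisin or two early.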

Example (note that for large numbers of cookies and/or small error values this will take a while to run, and that your answers may differ slightly from these because each estimate is random):

>>> raisins_needed(1, .1)
1
>>> raisins_needed(2, .1)
5
>>> raisins_needed(10, .1)
45
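
As an optional sanity check on the second example, the 2-cookie case can be worked out exactly. Each cookie is empty with probability (1/2)^n after n raisins, and both cannot be empty at once, so the probability that both cookies have a raisin is 1 - 2 * (1/2)^n. The helper name exact_two_cookies below is made up for this check and is not part of the lab.

```python
def exact_two_cookies(num_raisins):
    """Exact probability that both of 2 cookies get at least one raisin."""
    return 1 - 2 * 0.5 ** num_raisins

for n in range(2, 7):
    print(n, exact_two_cookies(n))
# 2 0.5
# 3 0.75
# 4 0.875
# 5 0.9375
# 6 0.96875
```

With an error of 0.1 the target is 0.9, which is first reached at 5 raisins, matching the example above.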

Acknowledgement: The raisin cookie problem was suggested by Greg Kochanski.