RECITATION NOTES              15-451 Algorithms              01/31/07

Some problems for recitation:
 * a problem related to last Thursday's lecture.
 * practice with lower bounds.
 * more practice with analyzing probabilistic processes:
     - the coupon collector's problem
     - coin flipping
=======================================================================
Problem related to Thursday's lecture:

We saw in lecture, by a "stack of bricks" argument, that a recurrence of
the form:

    T(n) = n + T(a1 * n) + T(a2 * n) + ... + T(ak * n)

(for constants a1,...,ak) solves to O(n) so long as a1 + ... + ak < 1.
What about a recurrence like:

    T(n) = n^2 + T(a1 * n) + T(a2 * n) + ... + T(ak * n)

or, more generally,

    T(n) = n^b + T(a1 * n) + T(a2 * n) + ... + T(ak * n)?

What conditions on the a_i do you need so that this solves to O(n^b)?

[Answer: we just need a1^b + ... + ak^b < 1. Let's define r to be the
quantity on the LHS of this inequality. Then each time you move down one
level in the stack of bricks, a brick of size s^b gets replaced by bricks
of total size (a1^b + ... + ak^b) * s^b = r * s^b. So, the sum of the
sizes of the bricks at level j is r times the sum at level j-1, which
means the total work done is a decreasing geometric series, and hence
O(n^b). A short numerical sanity check appears just after the
lower-bounds discussion below.]
============================================================================
Practice with lower bounds: lower bounds for quicksort variants

Last week we looked at 2 versions of quicksort: (1) pick the leftmost
element as the pivot, and (2) pick a *random* element as the pivot. We
saw there were inputs that could cause the 1st method to take n^2 time,
but proved that for *any* input, the expected time of the 2nd method is
only O(n log n).

But one might ask: what about a more sophisticated method of picking a
pivot? E.g., unix qsort uses a "median of 3" rule: look at the leftmost,
rightmost, and middle elements, take their median, and use that as the
pivot. Does that get rid of the Omega(n^2) worst case?

Claim: Any deterministic algorithm that selects its pivot by examining a
constant number c of elements and then choosing one of them as the pivot
will still result in quicksort taking Omega(n^2) comparisons in the
worst case.

Proof: Given some algorithm ALG that chooses pivots by examining <= c
elements, we will show that for any n, we (the adversary) can construct
an ordering of {1,...,n} that causes ALG to make at least n^2/c'
comparisons, for some constant c'.

Here is how we do it. Think of the array A as n boxes, initially all
empty. We will now simulate ALG on A, filling in the boxes as we go. Set
a counter i=1. Whenever ALG examines an empty box, put i in that box and
then increment i. Eventually, after examining <= c boxes, ALG picks one
of them as the pivot and partitions the array.

Key invariant: every box that is still empty will eventually receive an
element >= the current value of i, so every empty box belongs in the
GREATER pile of the partition. Since our procedure only ever increments
i, this invariant really is self-consistent, and it holds at every
partition step.

Now, after the kth partition step, at most kc boxes have been filled, so
the current GREATER pile still contains at least n - kc empty boxes, and
partitioning it costs at least that many comparisons. So the total
number of comparisons is at least:

    (n - c) + (n - 2c) + (n - 3c) + ...

which is > n^2/c' for some constant c'.   QED

One note: is this legal? After all, we came up with the elements as we
went along. But the point is that since ALG is deterministic, it will
behave in exactly the same way if you run it again on whatever input we
ended up constructing. (It was important that our decisions were
self-consistent, though.)
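[Aside, not part of the original notes: here is a quick Python sanity
check of the recurrence claim from the first problem. The exponent b and
the constants a_i below are arbitrary choices for illustration; we
evaluate T by memoized recursion and watch T(n)/n^b stay bounded, near
1/(1-r).]

    from functools import lru_cache

    B = 2.0            # the exponent b (arbitrary choice)
    AS = [0.5, 0.6]    # the a_i's; here r = 0.5^2 + 0.6^2 = 0.61 < 1

    @lru_cache(maxsize=None)
    def T(n):
        if n <= 1:
            return 1.0                  # base case: constant work
        return n**B + sum(T(int(a * n)) for a in AS)

    for n in [10, 100, 1000, 10000]:
        print(n, T(n) / n**B)           # stays bounded: ~1/(1-r) = 2.56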
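[Aside, not part of the original notes: the adversary argument can
actually be run as code. Below is a Python sketch specialized to the
median-of-3 rule (so c = 3), under my own arbitrary tie-breaking
choices. We simulate quicksort on n empty boxes, filling a box with the
counter value the moment the pivot rule examines it; boxes still empty
at partition time must end up >= the counter, so they all go GREATER.
We then re-run an ordinary median-of-3 quicksort on the constructed
input and watch the comparison count grow quadratically.]

    def build_bad_input(n):
        val = [None] * n        # val[i] = value in box i (None = empty)
        counter = [1]           # next value to hand out

        def examine(i):
            if val[i] is None:              # fill the box on first look
                val[i] = counter[0]
                counter[0] += 1
            return val[i]

        def simulate(boxes):    # boxes = list of box indices, in order
            if len(boxes) <= 1:
                for i in boxes:             # leaf: any consistent value
                    examine(i)
                return
            trio = sorted({examine(boxes[0]),             # median-of-3:
                           examine(boxes[len(boxes)//2]), # look at 3,
                           examine(boxes[-1])})           # take median
            pivot = trio[len(trio) // 2]
            less = [i for i in boxes
                    if val[i] is not None and val[i] < pivot]
            greater = [i for i in boxes
                       if val[i] != pivot
                       and (val[i] is None or val[i] > pivot)]
            simulate(less)      # everything in `less` is already filled
            simulate(greater)   # the still-empty boxes all land here

        simulate(list(range(n)))
        return val              # a permutation of {1, ..., n}

    def median3_comparisons(a):  # same pivot rule, counting comparisons
        if len(a) <= 1:
            return 0
        trio = sorted({a[0], a[len(a) // 2], a[-1]})
        pivot = trio[len(trio) // 2]
        less = [x for x in a if x < pivot]
        greater = [x for x in a if x > pivot]
        return ((len(a) - 1) + median3_comparisons(less)
                + median3_comparisons(greater))

    for n in [50, 100, 200, 400]:
        bad = build_bad_input(n)
        print(n, median3_comparisons(bad) / n**2)  # ratio stays ~1/4

On this adversarial input each partition peels off only the pivot and
one other element, so the comparison count is about n^2/4: quadratic,
as the claim promises.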
=======================================================================
Practice with probabilistic analysis.

The coupon collector's problem is a nice one for probabilistic analysis
using the tools we've discussed so far.

The problem is: suppose we have n boxes (let's label them 1,...,n) and
we repeatedly throw balls into the boxes uniformly at random. What is
the expected number of throws until every box has at least one ball in
it?

Answer: nH_n, where H_n = 1 + 1/2 + 1/3 + ... + 1/n.

The proof is as follows. Imagine a counter (starting at 0) that tells
us how many boxes have at least one ball in them. Let X_1 denote the
number of throws until the counter reaches 1 (so X_1 = 1). Let X_2
denote the number of throws from that point until the counter reaches
2. In general, let X_k denote the number of throws made from the time
the counter hit k-1 up until the counter reaches k. So, the total
number of throws is X_1 + ... + X_n, and by linearity of expectation,
what we are looking for is E[X_1] + ... + E[X_n].

How do we evaluate E[X_k]? Suppose the counter is currently at k-1.
Each time we throw a ball, the probability it lands in an empty box is
(n-(k-1))/n. So, another way to think about this question is as
follows:

Coin flipping: we have a coin that has probability p of coming up heads
(in our case, p = (n-(k-1))/n). What is the expected number of flips
until we get a heads?

It turns out that the "intuitively obvious answer", 1/p, is correct.
But why? Here is one way to see it: if the first flip is heads, then we
are done; if not, then we are back where we started, except we've
already paid for one flip. So the expected number of flips E satisfies:

    E = p*1 + (1-p)*(1 + E).

Solving for E gives E = 1/p. (A quick simulation of this fact appears
at the end of these notes.)

The above argument is a little tricky, so if you want, you can draw a
picture of a tree like this, where what we are looking for is the sum,
over all leaves, of the depth of the leaf times the probability of
reaching that leaf:

            *
          p/ \(1-p)
          *   *
            p/ \(1-p)
            *   *
                 ...

And if we let T denote that tree, the argument above is just saying
that one simple way of analyzing T is to view T as:

            *
          p/ \(1-p)
          *   T

Putting this all together, let CC(n) be the expected number of throws
until we have filled all the boxes. Since E[X_k] = 1/p = n/(n-(k-1)),
we have:

    CC(n) = E[X_1] + ... + E[X_n]
          = n/n + n/(n-1) + n/(n-2) + ... + n/1
          = n (1/n + 1/(n-1) + ... + 1/1)
          = n H_n.                                    QED
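[Aside, not part of the original notes: a quick simulation of the
coin-flipping fact above. The bias p and the number of trials are
arbitrary choices for illustration.]

    import random

    def flips_until_heads(p):
        flips = 1
        while random.random() >= p:   # this flip was tails (prob 1-p)
            flips += 1
        return flips

    p, trials = 0.3, 200_000
    avg = sum(flips_until_heads(p) for _ in range(trials)) / trials
    print(f"empirical {avg:.3f}  vs  1/p = {1/p:.3f}")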
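[Aside, not part of the original notes: the whole coupon collector
calculation can be checked by simulation too. The values of n and the
number of trials below are arbitrary choices.]

    import random

    def throws_to_fill(n):
        filled, throws = set(), 0
        while len(filled) < n:            # keep throwing until every
            filled.add(random.randrange(n))  # box has at least one ball
            throws += 1
        return throws

    n, trials = 50, 20_000
    avg = sum(throws_to_fill(n) for _ in range(trials)) / trials
    H_n = sum(1.0 / k for k in range(1, n + 1))
    print(f"empirical {avg:.1f}  vs  n*H_n = {n * H_n:.1f}")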