RECITATION NOTES	   15-451 Algorithms		      02/04/09

- hand back hwk
- problem from old final exam
- whatever else you want to do
=======================================================================
1. hand back hwk, go over questions people have.

2. Here is a problem from an old final exam, related to the
recent material.  [reworded a bit for this recitation notes]

Suppose we want to design a binary search tree over {1,...,n} to
minimize the worst-case time to look up any element.  Then clearly a
balanced tree is best.  But, what if we are allowed to use a
randomized construction (equivalently, a probability distribution over
search trees).  Perhaps we can do better in terms of minimizing the
worst-case *expected* time of any lookup.

In this problem, we consider the case of n=3.  It turns out
that over {1,2,3}, there are only three interesting trees to consider:
(a) the balanced tree with 2 at the root, (b) the zig-zag tree with 1
at the root, then 3, then 2, and (c) the zag-zig tree with 3, then 1,
then 2.

[it's not totally obvious that you don't need to use the other trees
like 1-2-3 or 3-2-1, but you don't]

We can think of this as a matrix game between the "algorithm player"
and the "adversary input player".  The algorithm picks a tree, the
adversary picks an input, and the algorithm-player then pays the 
adversary an amount equal to the depth of the item in the tree.

a) Fill in the entries for a 3x3 matrix game corresponding to this
problem.  We have one row for each tree, one column for each possible
element to be looked up, and the cells in the matrix should indicate
the cost to perform the given access in the given tree (assume the
cost of an access is the depth of the element in the tree, with the
root having depth 1).

b) Before getting to the main question, what are tight *deterministic*
   upper and lower bounds?  (I.e., what's the best tree in terms of
   worst-case depth?)

   Ans: 2.  The balanced tree has depth at most 2 and there's no tree
	    with depth 1.  From the point of view of the matrix, we
	    are saying that there is some row in which all entries are
	    at most 2 (upper bound of 2), and for every row there is some
	    entry that is at least 2 (lower bound of 2).

c) Now for the main question: what are tight *randomized* upper and
   lower bounds?  In particular, a the value of a randomized strategy
   for the row player gives us an upper bound, and the value of a
   randomized strategy for the column player gives us a lower bound.  The
   best possible upper bound is the (value of the) minimax optimal
   strategy for the row player, and the best possible lower bound is
   the (value of the) minimax optimal strategy for the column player.
   These will turn out to match.

Let's solve for the row player, and you can use the hint that it is
possible to argue by symmetry that the probability on the zag-zig tree
should equal the probability on the zig-zag tree.

Let p = probability on zig-zag = prob on zag-zig.  So the balanced
tree has probability 1-2p.

We want to minimize the worst-case (maximum) of
  cost of looking up 1:  2(1-2p) + p + 2p
  cost of looking up 2:  (1-2p) + 3p + 3p
  cost of looking up 3:  2(1-2p) + 2p + p.

The first and last of these are identical, so we can simplify this to:
we want to minimize the maximum of:

                  2 - p
and
                  1 + 4p.

Notice that as functions of p, one increases with p and one decreases
with p (and one is larger when p=0 and the other is larger when p=1).
So, the maximum will be smallest when both are equal.  I.e.,

             2 - p = 1 + 4p
so
             p = 1/5.

So, the minimax optimal strategy is to use the balanced tree with
probability 3/5, and each other tree with probability 1/5.  The
*value* of this strategy is 9/5.  So, it's a little better than just
using the balanced tree.

Actually, our argument shows that this is optimal, so this implies
this is also a lower bound (otherwise it wouldn't be optimal!)
but if we want we can also explicitly give a minimax-optimal strategy
for the column player.  In other words, what distribution on inputs is
the worst one for any tree algorithm?

In this case, it turns out we can assume the probability on 1 and 3
are equal.  So, let's call it q, and have the probability on 2 be
1-2q.  We now want to maximize the minimum of:
 average cost on balanced tree: 4q + (1-2q)
 average cost on zig-zag tree:  3q + 3(1-2q)
 average cost on zag-zig tree:  3q + 3(1-2q)

So, we want to maximize the minimum of 1 + 2q and 3 - 3q.
Again, the minimum is maximized when both are equal, so 5q=2, q=2/5.
So, the worst distribution for the inputs is 2/5 prob on 1, 2/5 prob
on 3, and 1/5 prob on 2.