PLAYING GAMES
-------------

Game-playing is one of AI's greatest successes. In Assignment 5 we
study game-playing techniques through Connect-4.

Classical game-playing techniques work under the following two
assumptions:

   * alternating-play two-person
   * no hidden information
   * each turn there are a finite number of moves

Here are some good examples.

  tic-tac-toe        Othello
  checkers           chess
  Go                 Connect-4 (!)

Here are some bad examples.

  poker              Stratego
  bridge             Scrabble
  Boggle

Explain the rules of Connect-4 (Section 7.2 of the assignment
handout), and play a game against the class on a 3x3 board where you
want 3-in-a-row to win.

Game evaluation
---------------

Back up about four moves and ask: How do you know to make that move?

One way to do it is to represent a game as a game tree. We'll just
build up all the possible ``futures'' from the current board state.

        -X-
        OX-
      __XOO__
     /       \
    XX-      -X-
    OX-      OXX
    XOO      XOO
     |      /   \
    XX-   OX-   -XO
    OXO   OXX   OXX
    XOO   XOO   XOO
     |     |     |
    XXX   OXX   XXO
    OXO   OXX   OXX
    XOO   XOO   XOO

Assign 1 to all wins for X, -1 for all wins to O, and 0 to all cat's
games.

Now to find the value of an internal node, we consider its children's
values. If it's X's turn, X would choose the maximum of its children,
since it wants to win, or tie if that's not possible. Contrariwise, O
will choose the minimum.

        -X-
        OX-
       _XOO_
      /  1  \
     /       \
    XX-      -X-
    OX-      OXX
    XOO      XOO
     1      / 0 \
     |     /     \
    XX-   OX-   -XO
    OXO   OXX   OXX
    XOO   XOO   XOO
     1     1     0 
     |     |     |
    XXX   OXX   XXO
    OXO   OXX   OXX
    XOO   XOO   XOO
     1     1     0 

So X should choose to go in the first column in order to guarantee a
win. If X went in the third column instead, O would go in the third
column to force a tie.

Evaluating for all nodes is called minimax search.

We can implement this recursively without actually constructing the
tree. In pseudocode (which you will implement in Assignment 5), we can
perform this computation as follows:

OPTIMAL_PLAY(board, player)
  // base case
  if board is a win for X      then return 1
  else if board is a tie       then return 0
  else if board is a win for O then return -1

  if player = X then best_value = -infty else best_value = infty fi
  for each legal move on board do
    try move
    value <- OPTIMAL_PLAY(new board, other player)
    undo move

    if player = X then best_value <- max(value, best_value)
    else               best_value <- min(value, best_value)
    fi
  od
  return best_value

Note how we might recompute the value for the same board several times.
This motivates memoization, which you will also do in the assignment.

Heuristics
----------

The problem with this is that the game tree grows exponentially as the
problem size grows.  (This is addressed in task 1 of the assignment.)
We'd still like to move in a few seconds, though. How can we move
faster, possibly losing some of the earlier rigor?

We use a heuristic function to estimate ``goodness'' of a board for
X.  Our strategy is to go down to a particular depth, evaluate a
heuristic function, and apply minimax search to tree.

Here's one possibility: we'll give ourself a point for each row,
column, or diagonal where X has two pieces and O has none, and we'll
remove a point for each row, column, or diagonal where O has two
pieces and X has none. If the board is in a final state, we'll
evaluate it at 10 if X has won, -10 if O has won, or 0 if it is a
tie.

It's X's move; if we only have time to go down two levels, it will look
like the following.

                ---
                -X-
       _________XOO_________
      /          1          \
     /           |           \
    ---         -X-         ---
    XX-         -X-         -XX
   _XOO_        XOO        _XOO_
  /  1  \      / 0 \      /  1  \
 /   |   \     |   |     /   |   \
O-- -O- ---   -X- -X-   --- -O- --O
XX- XX- XXO   OX- -XO   OXX -XX -XX
XOO XOO XOO   XOO XOO   XOO XOO XOO
 2   2   1     1   0     1   2   1

This is basically how real game-playing programs work. (They perform
some optimizations to trim the game tree more, but this is the basic
idea.)