Next: Learning to Solve Problems
Up: Experimental Methodology
Previous: Dependent Variables
Independent Variables
There are many independent variables that affect the performance of
the learning system. For the experiments described here we look at
the following variables:
- Selection method:
- Minimum-to-better: The main selection method defined by
Equation 3.
- Minimum-to-minimum: Our implementation of MACLEARN's
selection method.
- Any-to-better: Our implementation of MORRIS's
selection method.
- Heuristic function:
- RR: The heuristic function described above.
- Row-by-row-2 : The RR heuristic without the third
component.
- Reduction : A variation of the RR heuristic
that leads
to placing of tiles in the first row, then the last column, then the
second
row, then the second to last column etc. This heuristic was
suggested by Korf [16].
- Spiral: A variation of the RR heuristic
that leads
to placing the tiles in a spiral order from outside in.
- Sum-of-Manhattan-distances : The sum of the Manhattan
distances between each tile and its destination (will be denoted by
MD).
- Domain: Most of the experiments described here were
performed in
the 15-puzzle domain. However, we have tried MICRO-HILLARY in several
other domains. In all the domains, the initial states are generated
using the domain-independent training-problem generator which
applies a random sequence of operators to the goal state.
- 24-puzzle: 5 x 5 sliding tile puzzle.
- 10-cannibals: The cannibals and missionaries problem
[28]. To make
it non-trivial, we use 10 cannibals and 10 missionaries
instead of the usual 3 and 3. 10 cannibals and 10 missionaries
are situated on the two banks of a river. They
must all be moved onto one target bank. They can use
a boat which can ship one or two persons. The cannibals should never
outnumber the missionaries on either bank. The heuristic function
used is the number of persons that are not yet located on the target bank.
- 10-stones: A puzzle that appeared in Nilsson's book
[28].
Instead of using 3 black stones and 3 white stones we use 5 black
stones and 5 white stones. The goal configuration is
.
A stone may move into an adjacent empty cell or may hop over one or
two other stones into an empty cell.
The heuristic function used is
the number of stones not yet in their goal position.
- 5-Hanoi: The Towers of Hanoi problem with 5 rings
in increasing sizes. There
are 3 pegs and 5 rings. The goal position has the 5 rings placed
on the left peg. A ring may be moved to the top of another peg provided that
it is not placed on a smaller ring as a result. The heuristic function used is
the number of rings not yet placed on their target peg.
- Grid: A grid of 50 x 50 with random parallel walls
inserted to make the Manhattan-distance heuristic inaccurate.
There are four basic operators (North, South, West and East). A goal
state can be any location on the grid. The objective is to find a
path from the initial location to the goal location. Figure
7 shows the grid used.
The emphasis of this paper is on building a general learning algorithm
that is able to solve the NxN puzzle problem. The goal of testing
the algorithm on these other domains was to show that the learning
algorithm is indeed general and does not include domain-specific
procedures.
Figure 7:
The grid domain
|
Next: Learning to Solve Problems
Up: Experimental Methodology
Previous: Dependent Variables
Shaul Markovitch
1998-07-21