Selective Attention: Acquiring Macros Leading from Local Minima to Better States
Given a search trace, every subpath of it can be recorded as a macro.
Obviously, there are too many subpaths, so we need to employ a filter
in order to acquire a set of macros with high utility.
What properties of macros are likely to indicate high utility? There
are three requirements that should be fulfilled:
1. The macros should be useful, i.e., they should save the system
significant search resources.
2. There should be as few macros as possible. Redundant macros
increase the branching factor of the search graph without
associated benefits.
3. The macros should be as short as possible, for two reasons:
   - Long macros tend to be less general. The main reason is that
   applying a long sequence of basic operators to a state is likely to
   encounter an illegal state.
   - For the learning method of MICRO-HILLARY, longer macros require
   significantly more learning resources (for the escape procedure).
The filtering method we use for MICRO-HILLARY only acquires macros leading
from a local minimum to a better state. Whenever not in a local
minimum, the search procedure can use the heuristic function, so
macros that do not start in a local minimum are not necessary.
The macros acquired by this filtering method
are useful since they enable the search procedure to
continue from a local minimum. These macros are also the shortest that
satisfy that condition.
We call this method minimum-to-better filtering.
Let h be a well-behaved heuristic function (not necessarily admissible), let
e = (s_0, s_g) be a training problem, and let p = <o_1, ..., o_n> be a
solution to e. Let s_i denote the state resulting from the application of
the sequence <o_1, ..., o_i> to the initial state. A subpath
<o_{i+1}, ..., o_j> is selected by the minimum-to-better attention filter
if and only if the following condition holds:

LocalMin(s_i)  ∧  h(s_j) < h(s_i)  ∧  ∀k (i < k < j → h(s_k) ≥ h(s_i))    (3)

where LocalMin(s) holds if and only if no single operator in O leads from s
to a state with a heuristic value lower than h(s).
The first conjunct says that the first state of a macro must be a
local minimum. It serves to satisfy requirement 2 above--that there
should be as few macros as possible. The second conjunct says that the
last state must be a better state. It serves to satisfy requirement 1
above--that macros should be useful. The third conjunct says that
the macro is the shortest satisfying the former two conjuncts. It
serves to satisfy the third requirement above--that macros should be as
short as possible.
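The selection rule of Equation 3 can be sketched in code. The following Python sketch is illustrative, not MICRO-HILLARY's implementation; the state representation, the heuristic h, and the neighbors function are assumptions of the example.

```python
def is_local_minimum(s, h, neighbors):
    # A state is a local minimum if no single operator leads to a
    # state with a strictly lower (better) heuristic value.
    return all(h(t) >= h(s) for t in neighbors(s))

def minimum_to_better_macros(path, ops, h, neighbors):
    """Minimum-to-better filter over one solution trace.

    path[i] is the state reached by applying ops[:i] to the initial
    state.  For each local minimum s_i on the path (the goal state
    excluded), the shortest subpath o_{i+1}..o_j leading to the
    first strictly better state s_j is recorded as a macro.
    """
    macros = []
    for i, s in enumerate(path[:-1]):          # skip the goal state
        if not is_local_minimum(s, h, neighbors):
            continue
        for j in range(i + 1, len(path)):
            if h(path[j]) < h(s):              # first strictly better state
                macros.append(tuple(ops[i:j]))
                break
    return macros
```

On a toy one-dimensional domain with a single local minimum, the filter records exactly one macro: the shortest operator sequence that climbs out of that minimum.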
When a sufficient number of such macros has been acquired, the search
program
may be able to advance toward the goal state without ever
being trapped in local minima.
Definition 1
Let S be a set of states and O a set of operators. Let s_g be a goal
state. Let h be a well-behaved heuristic function. Let M be a set of
macro-operators. We say that M is complete with respect to h and s_g
if and only if every state s ∈ S from which s_g is reachable satisfies:
s = s_g, or there exists o ∈ O ∪ M applicable to s such that
h(o(s)) < h(s).
A complete macro set ``smooths'' the heuristic
function by eliminating local minima. The length of the individual
macros in a complete set depends on the distance between local minima
and their associated better state.
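Completeness in the sense of Definition 1 can be tested directly: no non-goal state may remain a local minimum once the macros are added. A small illustrative Python sketch, with the state space, successors function, and heuristic all invented for the example:

```python
def is_complete(states, successors, h, goal):
    """A macro set M is complete w.r.t. h and the goal state if every
    non-goal state has some successor -- reached by a basic operator
    or by a macro in M -- with a strictly lower heuristic value, i.e.
    no local minima remain in the augmented search graph.
    (Assumes every state in `states` can reach the goal.)"""
    return all(s == goal or any(h(t) < h(s) for t in successors(s))
               for s in states)
```

In the example below, the basic operators leave one state trapped in a local minimum, and adding a single macro that jumps over the plateau makes the set complete.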
Definition 2
Let S be a set of states and O be a set of operators. Let s_g be a goal
state. Let h be a well-behaved heuristic function. Let d(s_1, s_2) denote
the graph distance between s_1 and s_2 in the graph defined by O. The
radius of a state s with respect to h, s_g and O is defined as:

radius(s) = 0 if s = s_g; otherwise radius(s) = min { d(s, s') | s' ∈ S, h(s') < h(s) }.

The radius of the goal state is 0. The radius of a state that is not a
local minimum is 1.
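Read this way, the radius is simply the breadth-first distance from a state to the nearest strictly better state. A Python sketch, assuming a finite graph with an explicitly enumerable neighbors function (all names are illustrative):

```python
from collections import deque

def radius(s, h, neighbors, goal):
    """Graph distance from s to the closest state with a strictly
    lower heuristic value; 0 for the goal state (Definition 2)."""
    if s == goal:
        return 0
    seen = {s}
    frontier = deque([(s, 0)])
    while frontier:
        state, dist = frontier.popleft()
        for t in neighbors(state):
            if t in seen:
                continue
            if h(t) < h(s):
                return dist + 1   # first strictly better state found
            seen.add(t)
            frontier.append((t, dist + 1))
    return float('inf')           # no better state is reachable
```

A state that is not a local minimum has a better neighbor one step away, so its radius is 1, matching the remark above.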
Figure 3:
An illustration of the differences between the minimum-to-minimum
and minimum-to-better filtering methods.
The X axis stands for the sequence of states along the solution path.
The Y axis stands for the heuristic values of states.
(a) A situation where
the minimum-to-better strategy acquires short macros
that are just sufficient to allow progress in the
heuristic search while
the minimum-to-minimum strategy acquires unnecessarily long
macros that are likely to be less applicable.
(b) A situation where
the minimum-to-minimum strategy acquires a macro
that leads from a local minimum to a state with worse
(higher) value.
The minimum-to-better strategy always acquires macros
that lead to a state with a better (lower) value.
The minimum-to-better filter looks similar to the
minimum-to-minimum filtering method used by Iba
[10]. However, there are
several problems with the minimum-to-minimum method that the
minimum-to-better filter avoids.
When the distance between minima is large, the minimum-to-minimum
filter will acquire very long macros. This violates the third
requirement above.
Figure 3(a) illustrates this problem.
The X axis stands for the sequence of states along the solution path.
The Y axis stands for the heuristic values of states.
Long macros such as those acquired by the minimum-to-minimum filter
tend to be much less general (they are more likely to fail).
This problem is intensified with incremental
learning. In the first stages of learning, the minima
are close together and short macros are learned.
These macros, however, make minima more sparse in later stages
of learning, leading to the acquisition of very long macros.
This problem becomes extreme when there is only one minimum in the
solution path. In such a case, the minimum-to-minimum filter will either
learn nothing, or will use Iba's approach and consider the start and
end states of the solution path as minima, leading to the acquisition
of two very long macros.
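For contrast, a minimum-to-minimum filter in the same illustrative setting records the subpath between each pair of consecutive local minima on the path, however far apart they are. This is a simplified reading of Iba's filter; the toy domain and all names are invented for the example.

```python
def minimum_to_minimum_macros(path, ops, h, neighbors):
    # Record the operator subsequence between each pair of
    # consecutive local minima on the solution path.
    minima = [i for i, s in enumerate(path)
              if all(h(t) >= h(s) for t in neighbors(s))]
    return [tuple(ops[i:j]) for i, j in zip(minima, minima[1:])]
```

On a path whose only minima are an early one and the goal itself, this filter records a single macro spanning nearly the whole path, whereas the minimum-to-better filter would have stopped at the first strictly better state with a shorter macro.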
Indeed, when we tried to replace the
minimum-to-better filter with the minimum-to-minimum
filter, MICRO-HILLARY went into a very long session of acquiring
very long macros. Iba had to introduce an acquisition filter that
blocks long macros in order to mitigate this phenomenon.
Another problem with the minimum-to-minimum method is
that it may learn routes from a minimum to a worse (higher)
minimum. This violates the first requirement above.
Figure 3(b) illustrates this problem.
Of course, it may
then learn another macro from the worse minimum, but the total number of
macros still increases needlessly.
The minimum-to-better strategy always acquires macros
that lead to a state with a better (lower) value.
In Section 6 we discuss further differences
between MACLEARN and MICRO-HILLARY.
The T-Macros acquired by MORRIS [23] also
resemble the macros acquired by MICRO-HILLARY. A T-Macro is ``locally
anomalous. Its initial segment appears to make no progress, but the
sequence as a whole is rated as advantageous''
[23]. It is therefore possible to call the
selection strategy employed by MORRIS any-to-better.
This method, which does not insist on starting a macro from a local
minimum, will acquire more macros than the
minimum-to-better strategy. These macros are not necessary and
violate the second requirement above.
The larger number of (possibly unnecessary)
macros will increase the branching factor and
may deteriorate the performance of the program that uses them.
Figure 4:
An illustration of the difference between the any-to-better
and minimum-to-better selection methods.
The X axis stands for the sequence of states along the solution path.
The Y axis stands for the heuristic value of states.
The any-to-better strategy acquires more (possibly unnecessary) macros.
Figure 4 illustrates the
difference between the two strategies. While the minimum-to-better
strategy will acquire one macro that smooths the path, the any-to-better strategy acquires many macros that will unnecessarily
increase the branching factor.
Shaul Markovitch
1998-07-21