Let S be a finite set of states. Let O be a finite
set of operators
where each operator
is a function
.
If
,
we say that o(s) is
undefined.
A problem is a pair of states
,
where
is called the initial state and
is called the goal state.2 A solution is a sequence
of operators
,
such
that
.
A problem
is solvable
if there exists a solution for p (specified
by
solvable(si,sg)).
A macro-operator is a sequence of operators
.
A macro-operator m is applied to a state
by applying
its basic
operators in a sequence. If while applying the sequence of operators we encounter
an undefined application, the application of the macro is undefined.
A problem solver
is an algorithm that takes a problem
p
and a set of operators O and applies operators to states, searching
for a solution
to p.
Given a solvable problem p, a problem solver
and a set of operators
O, we define
,
the cost of solving p using
and
O, as the number of operator applications performed by
until a solution is found. Note that SearchCost is different from
the solution cost--the cost of the solution path according to some
cost function. In this work we are only interested in satisficing search--minimizing the
cost of the search regardless of the cost of the resulting solution.
The utility of a set of macros M, with respect to a problem
p,
a problem solver
and a set of operators O, is defined as
Most of the macro-learning systems perform
learning by experimentation [25]. The
program
solves training problems and acquires sequences of operators applied
during the search. Given a set of operators O,
a distribution of problems D,
and a problem solver ,
the goal of a macro-learning
system is to acquire a set
of macro-operators M, such that
is positive,
and as high as possible. It is quite possible, however, that
the utility of M will be negative, as the added macros
also increase the branching factor.
Markovitch and Scott [22] introduced the information filtering framework, which offers a systematic way of dealing with the utility problem by being more selective. The framework identifies five logical types of selection processes in learning systems: selective experience, selective attention, selective acquisition, selective retention and selective utilization. The framework views learning programs as information processing systems where information flows from the experience space through some attention mechanism, through an acquisition procedure to the knowledge base and finally to the problem solver. The five filters are defined with respect to their logical location within the information flow.
![]() |
The general architecture offered by the information-filtering framework, instantiated to off-line macro-learning, is shown in Figure 1. Markovitch and Scott argue that the art of building successful learning systems often lies in selecting the right combination of filters. A careful examination of existing macro-learning systems reveals that those containing sophisticated filtering mechanisms performed the most successful learning.