The simplest approach for using planning graphs for belief space
planning heuristics is to use a ``classical'' planning graph. To
form the initial literal layer from the projected belief state, we
could either sample a single state (denoted $SG^1$) or use an
aggregate state (denoted $SG^U$). For example, in CBTC (see Figure
5) assuming regression search with the projected belief state
$BS_P = BS_I$, the initial level $\mathcal{L}_0$ of the planning
graph for $SG^1$ might be:
$\mathcal{L}_0 = \{\text{arm}, \text{clog}, \text{inP1}, \neg\text{inP2}\}$,
and for $SG^U$ it is defined by the aggregate state of $BS_I$:
$\mathcal{L}_0 = \{\text{arm}, \text{clog}, \text{inP1}, \text{inP2}, \neg\text{inP1}, \neg\text{inP2}\}$.
Since these two versions of the single planning graph have
identical semantics, aside from the initial literal layer, we
proceed by describing the $SG^U$ graph and point out differences
with $SG^1$ where they arise.
Graph construction is identical to that of classical planning graphs
(including mutex propagation) and stops when two consecutive literal
layers are identical (level off). We use the planning graph
formalism of IPP [20] to allow for explicit representation of
conditional effects, meaning there is a literal layer
$\mathcal{L}_k$, an action layer $\mathcal{A}_k$, and an effect
layer $\mathcal{E}_k$ in each level $k$. Persistence for a literal
$l$, denoted by $l_p$, is represented as an action whose execution
precondition and single unconditional effect are both $l$. A literal
is in $\mathcal{L}_k$ if an effect from the previous effect layer
$\mathcal{E}_{k-1}$ contains the literal in its consequent. An
action is in the action layer $\mathcal{A}_k$ if every one of its
execution precondition literals is in $\mathcal{L}_k$. An effect is
in the effect layer $\mathcal{E}_k$ if its associated action is in
the action layer $\mathcal{A}_k$ and every one of its antecedent
literals is in $\mathcal{L}_k$.
Using conditional effects in the planning graph avoids factoring an
action with conditional effects into a possibly exponential number
of non-conditional actions, but adds an extra planning graph layer
per level. Once our graph is built, we can extract heuristics.
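To make the layer-membership rules concrete, the following is a minimal sketch of IPP-style level expansion (Python; the Action/Effect records are hypothetical, and mutex propagation is omitted, so this is a simplification of the construction described above rather than the paper's implementation):

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple

Literal = str

@dataclass(frozen=True)
class Effect:
    antecedent: FrozenSet[Literal]   # conditional-effect antecedent (empty if unconditional)
    consequent: FrozenSet[Literal]   # literals the effect asserts

@dataclass(frozen=True)
class Action:
    name: str
    precondition: FrozenSet[Literal]  # execution precondition
    effects: Tuple[Effect, ...]

def persistence(lit: Literal) -> Action:
    """Persistence ('noop') for a literal: precondition and lone effect are the literal itself."""
    return Action(lit + "_p", frozenset({lit}),
                  (Effect(frozenset(), frozenset({lit})),))

def expand_level(literals: Set[Literal], actions: List[Action]):
    """Build A_k, E_k, and L_{k+1} from literal layer L_k (mutexes omitted)."""
    noops = [persistence(l) for l in literals]
    # An action enters A_k if all of its execution-precondition literals are in L_k.
    a_k = [a for a in actions + noops if a.precondition <= literals]
    # An effect enters E_k if its action is in A_k and its antecedent literals are in L_k.
    e_k = [(a, e) for a in a_k for e in a.effects if e.antecedent <= literals]
    # A literal enters L_{k+1} if some effect in E_k has it in its consequent.
    nxt = set().union(*(e.consequent for _, e in e_k)) if e_k else set()
    return a_k, e_k, nxt

def build_graph(initial: Set[Literal], actions: List[Action]):
    """Expand levels until two consecutive literal layers are identical (level off)."""
    literal_layers = [set(initial)]
    action_layers, effect_layers = [], []
    while True:
        a_k, e_k, nxt = expand_level(literal_layers[-1], actions)
        if nxt == literal_layers[-1]:          # level off
            return literal_layers, action_layers, effect_layers
        action_layers.append(a_k)
        effect_layers.append(e_k)
        literal_layers.append(nxt)
```

Keeping effects as first-class layer members is what lets a single action with conditional effects contribute different consequents under different antecedents without being split into separate non-conditional actions.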
No Aggregation: Relaxed plans within a single planning graph
are able to measure, under the most optimistic assumptions, the
distance between two belief states. The relaxed plan represents a
distance between a subset of the initial layer literals and the
literals in a constituent of our belief state. In the $SG^U$, the
literals from the initial layer that are used for support may not
hold in a single state of the projected belief state, unlike the
$SG^1$. The classical relaxed plan heuristic finds a set of
(possibly interfering) actions to support the goal constituent. The
relaxed plan is a subgraph of the planning graph, of the form
$\{\mathcal{A}^{RP}_0, \mathcal{E}^{RP}_0, \mathcal{L}^{RP}_1,
\ldots, \mathcal{A}^{RP}_{b-1}, \mathcal{E}^{RP}_{b-1},
\mathcal{L}^{RP}_b\}$. Each of the layers contains a subset of the
vertices in the corresponding layer of the planning graph.
More formally, we find the relaxed plan to support the constituent
that is reached earliest in the graph (as found by the
$h^{SG}_{level}$ heuristic in Appendix A). Briefly, $h^{SG}_{level}$
returns the first level $b$ where a constituent of the belief state
$BS$ being evaluated has all its literals in $\mathcal{L}_b$ and
none are marked pair-wise mutex. Notice that this is how we
incorporate negative interactions into our heuristics. We start
extraction at the level $b$, by defining $\mathcal{L}^{RP}_b$ as the
literals in the constituent used in the level heuristic. For each
literal in $\mathcal{L}^{RP}_b$, we select a supporting effect
(ignoring mutexes) from $\mathcal{E}_{b-1}$ to form the subset
$\mathcal{E}^{RP}_{b-1}$; we prefer persistences to other effects
when choosing support. Once a supporting set of effects is found, we
create $\mathcal{A}^{RP}_{b-1}$ as all actions with an effect in
$\mathcal{E}^{RP}_{b-1}$. Then the needed preconditions of the
actions in $\mathcal{A}^{RP}_{b-1}$ and the antecedents of the
chosen effects in $\mathcal{E}^{RP}_{b-1}$ are added to the list of
literals to support from $\mathcal{L}^{RP}_{b-1}$. The algorithm
repeats until we find the needed actions from $\mathcal{A}^{RP}_0$.
A relaxed plan's value is the summation of the number of actions in
each action layer. A literal persistence, denoted by a subscript
``p'', is treated as an action in the planning graph, but in a
relaxed plan we do not include it in the final computation of
$h^{SG}_{RP}$. The single graph relaxed plan heuristic is computed
as
$$h^{SG}_{RP}(BS) = \sum_{k=0}^{b-1} \left|\left\{a \in \mathcal{A}^{RP}_k \mid a \text{ is not a persistence}\right\}\right|.$$
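A minimal sketch of this backward extraction (Python; the flattened effect-record representation of the effect layers, and resolving ties by taking the first supporter, are simplifying assumptions of the sketch, not the paper's implementation):

```python
from typing import FrozenSet, List, Set, Tuple

Literal = str
# One record per effect in E_k: (action name, is_persistence,
#  execution precondition, effect antecedent, effect consequent).
EffectRec = Tuple[str, bool, FrozenSet[Literal], FrozenSet[Literal], FrozenSet[Literal]]

def extract_relaxed_plan(effect_layers: List[List[EffectRec]],
                         constituent: Set[Literal],
                         b: int) -> Tuple[List[Set[str]], int]:
    """Walk back from level b (where the constituent first appears non-mutex)
    to level 0, picking one supporting effect per literal (persistences
    preferred) and summing the non-persistence actions."""
    to_support = set(constituent)                       # L^RP_b
    rp_action_layers: List[Set[str]] = [set() for _ in range(b)]
    value = 0
    for k in range(b - 1, -1, -1):
        next_to_support: Set[Literal] = set()
        chosen: Set[str] = set()                        # A^RP_k (action names)
        for lit in to_support:
            # Assumes every literal is supported at level k, as the level heuristic guarantees.
            supporters = [e for e in effect_layers[k] if lit in e[4]]
            # Prefer persistence of the literal over other supporting effects.
            eff = next((e for e in supporters if e[1]), supporters[0])
            name, is_noop, precond, antecedent, _ = eff
            if name not in chosen:
                chosen.add(name)
                if not is_noop:
                    value += 1                          # persistences are not counted
            # Preconditions and the chosen effect's antecedent must be
            # supported one level down (they join L^RP_k).
            next_to_support |= precond | antecedent
        rp_action_layers[k] = chosen
        to_support = next_to_support
    return rp_action_layers, value
```

On the CBTC layers, with the goal constituent $\{\neg\text{arm}, \neg\text{clog}\}$ first non-mutex at $b = 2$, such a walk can reproduce the hand extraction below: Flush at level zero, DunkP1 at level one, and a value of 2.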
For the CBTC problem we find a relaxed plan from the $SG^U$ graph, as
shown in Figure 5 as the bold edges and nodes. Since $\neg$arm and
$\neg$clog are non-mutex at level two, we can use persistence to
support $\neg$clog and DunkP1 to support $\neg$arm in
$\mathcal{A}^{RP}_1$. In $\mathcal{A}^{RP}_0$ we can use persistence
for inP1, and Flush for $\neg$clog. Thus, $h^{SG}_{RP} = 2$ because
the relaxed plan is:
$\mathcal{A}^{RP}_0 = \{\text{inP1}_p, \text{Flush}\}$,
$\mathcal{A}^{RP}_1 = \{\neg\text{clog}_p, \text{DunkP1}\}$.
The relaxed plan does not use both DunkP2 and DunkP1 to support
$\neg$arm. As a result $\neg$arm is not supported in all worlds
(i.e., it is not supported when the state where inP2 holds is our
initial state). Our initial literal layer threw away knowledge of
inP1 and inP2 holding in different worlds, and the relaxed plan
extraction ignored the fact that $\neg$arm needs to be supported in
all worlds. Even with an $SG^1$ graph, we see similar behavior
because we are reasoning with only a single world. A single,
unmodified classical planning graph cannot capture support from all
possible worlds; hence there is no explicit aggregation over
distance measures for states. As a result, we do not mention
aggregating states to measure positive interaction, independence, or
overlap.