The additive heuristic does not take reuse of actions (other than the dummy action a0) into account, so it often overestimates the actual number of actions needed to complete a plan. The need to take positive interaction into account in order to obtain a more accurate heuristic estimate has been recognized in both state space planning (Nguyen & Kambhampati, 2000; Hoffmann & Nebel, 2001; Refanidis & Vlahavas, 2001) and plan space planning (Nguyen & Kambhampati, 2001). For IPC3 we used a slight modification of the additive heuristic to address the issue of action reuse:
To illustrate the difference between
hadd() and
haddr(
) consider a planning domain with two
action schemas A1 and A2, where A1 has no preconditions and
A2 has a single precondition q. Assume that q can only be
achieved through an action instance of A1. The heuristic cost for
the literal q is therefore 1 according to the additive heuristic.
Consider now a plan
with two unordered actions a1 and a2
(ai being an instance of action schema Ai) and a single open
condition
a2. We have
hadd(
) = hadd(q) = 1 corresponding to the addition of a new instance
of action schema A1 to achieve q, but
haddr(
) = 0 because there is an action (viz. a1) that is not ordered after a2 and has an effect that unifies
with q. Table 1 shows that taking reuse into
account can have a significant impact on planning time in practice.
The modified additive heuristic
haddr clearly
dominates
hadd in the DriverLog and ZenoTravel domains
despite incurring a higher overhead per generated plan. The results
in the Satellite domain are more mixed, with
hadd having a
slight edge overall. We show planning times for the four flaw
selection strategies that were used by VHPOP at IPC3. These
and other novel flaw selection strategies are discussed in detail in
Section 3.2.2.
Hoffmann and Nebel (2001) describe the FF heuristic that takes positive interaction between actions into account by extracting a plan from the relaxed planning graph1, and argue that the accounting of action reuse is one of the contributing factors to FF's performance advantage over HSP. The FF heuristic can take reuse of potential actions into account, and not just existing actions as is the case with our modified additive heuristic. This should result in a better estimate of actual plan cost, but requires that a plan is extracted from the relaxed planning graph for every search node, which could be costly. It would be interesting to see how the FF heuristic performs if used in a plan space planner.
The heuristic cost function used in REPOP (Nguyen & Kambhampati, 2001), a heuristic partial order planner working solely with ground actions, is defined using a serial planning graph.2 The heuristic is similar in spirit to the FF heuristic, and can like the FF heuristic take reuse of potential actions into account. The REPOP heuristic also takes into account reuse of existing actions, but seemingly without considering ordering constraints, which is something we do in our modified additive heuristic. Furthermore, our haddr heuristic always takes reuse of any existing actions that achieves a literal q into account, while the REPOP heuristic only considers an existing action if it happens to be selected from the serial planning graph as the the action that achieves q. The results in Table 2 indicate that the REPOP heuristic may be less effective than the additive heuristic (with and without reuse) in certain domains.
Håkan L. S. Younes