Introduction

During the first half of the last decade, much of the research in domain independent plan generation focused on partial order causal link (POCL) planners. The two dominant POCL planners were SNLP (McAllester & Rosenblitt, 1991) and UCPOP (Penberthy & Weld, 1992), and a large part of the planning research was aimed at scaling up these two planners. The most promising attempts at making POCL planning practical involved alternative flaw selection strategies (Peot & Smith, 1993; Joslin & Pollack, 1994; Schubert & Gerevini 1995; Williamson & Hanks, 1996; Pollack, Joslin, & Paolucci, 1997). A flaw in POCL planning is either an unlinked precondition (called open condition) for an action, or a threatened causal link. While flaw selection is not a backtracking point in the search through plan space for a complete plan, the order in which flaws are resolved can have a dramatic effect on the number of plans searched before a solution is found. The role of flaw selection in POCL planning is similar to the role of variable selection in constraint programming.

There have been dramatic advances in domain independent planning in the past seven years, but the focus has shifted from POCL planning to CSP-based planning algorithms (Blum & Furst, 1997; Kautz & Selman, 1996) and state space planning as heuristic search (Bonet & Geffner, 2001b; Hoffmann & Nebel, 2001). Recently, Nguyen and Kambhampati (2001) showed that with techniques such as distance based heuristics and reachability analysis--largely responsible for the efficiency of today's best domain independent planners--can also be used to dramatically improve the efficiency of POCL planners, thereby initiating a revival of this previously popular approach to domain independent planning. We have drawn from their experience, as well as from experience with flaw selection strategies from the glory-days of POCL planning, when developing the Versatile Heuristic Partial Order Planner (VHPOP), and the result is a POCL planner that was able to compete well with CSP-based and heuristic state space planners at the 3rd International Planning Competition (IPC3).

We have previously (Younes & Simmons, 2002) adapted the additive heuristic--proposed by Bonet, Loerincs, and Geffner (1997) and used in HSP (Bonet & Geffner, 2001b)--for plan space search. In this paper we present a variation of the additive heuristic for POCL planning that accounts for possible reuse of actions that are already part of a plan. We show that this accounting for positive interaction often results in a more effective plan ranking heuristic. We also present ablation studies that demonstrate the effectiveness of a tie-breaking heuristic based on estimated planning effort (defined as the total number of open conditions, current and future, that need to be resolved in order to complete a partial plan). The results show that using this tie-breaking heuristic almost always improves planner performance.

While the heuristics implemented in VHPOP can work with either ground (fully instantiated) or lifted (partially instantiated) actions, we chose to work only with ground actions at IPC3. We have shown elsewhere (Younes & Simmons, 2002) that planning with lifted actions can help reduce the branching factor of the search space compared to using ground actions, and that this reduction sometimes is large enough to compensate for the added complexity that comes with having to keep track of variable bindings. Further studies are needed, however, to gain a better understanding of the circumstances under which planning with lifted actions is beneficial.

VHPOP efficiently implements all the common flaw selection strategies, such as DUnf and DSep (Peot & Smith, 1993), LCFR (Joslin & Pollack, 1994), and ZLIFO (Schubert & Gerevini, 1995). In addition to these, we introduce numerous novel flaw selection strategies in this paper, of which four were used at IPC3. While we do not claim to have resolved the issue of global versus local flaw selection--manifested by the conflicting claims made by Gerevini and Schubert (1996) on the one hand, and Pollack et al. (1997) on the other about the most efficient way to reduce the number of searched plans in POCL planning--we show that by combining ideas from both ZLIFO and LCFR we can get very efficient flaw selection strategies. Other novel flaw selection strategies introduced in this paper are based on heuristic cost, an idea previously explored by Ghallab and Laruelle (1994). We also introduce ``conflict-driven'' flaw selection strategies that aim to expose possible inconsistencies early in the search, and we show that strategies based on this idea can be effective in domains previously thought to be particularly difficult for POCL planners.

Ideally, we would like to have one single flaw selection strategy that dominates all other strategies in terms of number of solved problems. We have yet to discover such a universal strategy, so instead we use a technique previously explored by Howe, Dahlman, Hansen, Scheetz, and von Mayrhauser (1999) for combining the strengths of different planning algorithms. The idea is to run several planners concurrently, and Howe et al. showed that by doing so more problems can be solved than by running any single planner. In VHPOP we use the same basic POCL planning algorithm in all instances, but we use different flaw selection strategies concurrently.

VHPOP extends the capabilities of classical POCL planners by also supporting planning with durative actions. This is accomplished by adding a simple temporal network (STN) (Dechter, Meiri, & Pearl, 1991) to the regular plan representation of a POCL planner. The STN records temporal constraints between actions in a plan, and supersedes the simple ordering constraints usually recorded by POCL planners. The use of STNs permits actions with interval constraints on the duration (a feature that was not utilized by any of the domains at IPC3 that VHPOP could handle). The approach we take to temporal POCL planning is essentially the same as the constraint-based interval approach described by Smith, Frank, and Jónsson (2000), and similar techniques for handling durative actions in a POCL framework can be traced back at least to Vere's DEVISER (Vere, 1983). Our contribution to temporal POCL planning is demonstrating that the same heuristic techniques shown to boost the performance of classical POCL planning can also be effective in domains with durative actions, validating the feasibility of the POCL paradigm for temporal planning on a larger set of benchmark problems than has been done before.

Håkan L. S. Younes
2003-08-26