Marvin's underlying search algorithm is based on that used by FF: forward-chaining heuristic search using the RPG heuristic. However, Marvin includes some important modifications to the basic FF algorithm. These are: a least-bad-first search strategy for exploring plateaux, a greedy best-first strategy for searching when EHC fails and the development and use of plateau-escaping macro-actions.
As in FF the first approach to finding a solution plan is to perform EHC search using only helpful actions. The first successor with a heuristic strictly better than the best so far is taken, should one be found. If one is not found, then a plateau has been encountered, and a form of best-first search using helpful actions is used (instead of the breadth-first search of FF) to try to find an action sequence to escape from it. Because the states on a plateau can never improve on the heuristic value of the node at the root of the plateau, we call this least-bad-first search.
If the EHC approach is unable to find a plan, Marvin resorts to a modified form of best-first search using all the actions applicable in each state. This expands the first strictly better successor whilst keeping the current state for further expansion later if necessary. We call this strategy greedy best-first search. As can be seen in the graphs in Section 6.2, in some of the IPC 4 domains our modifications enable the solution of problems that are unsolvable for best-first search.
During the EHC search strategy, Marvin uses plateau-escaping macro-actions learned from previous searches of similar plateaux. These can be applied in the same way as atomic actions to traverse plateaux in a single step. Plateau-escaping macro-actions are learned online and the planner must decide which ones are likely to be applicable at which points during search. In Section 6.5 we show that plateau-escaping actions can yield performance benefits. Their power depends on the structure of the search space and the ability of the planner to learn re-usable macro-actions.
Least-bad-first search on plateaux, greedy best-first search and plateau-escaping macro-actions are the three main features of Marvin distinguishing its basic search strategy from that of other forward heuristic search-based planners. We now discuss these three features in more detail before going on to describe how they can be exploited in the context of ADL domains and domains involving derived predicates.