The planners did not perform quite as
advertised or expected given some problem features. This discrepancy
could have many possible causes: incorrectly specified problems,
planners less sensitive to problem features than thought, incorrect
solutions, and so on. For example, many of the problems in the benchmark set
were not designed for the competitions, or even intended to be widely
used, and so may not have been specified carefully enough.
Recommendation 8: When problems are contributed to the benchmark set,
developers should verify that the requirements stated in the description of
each problem correctly reflect the subset of features needed. Planner
evaluators should then use only those problems that match a planner's
capabilities.
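The matching step in this recommendation amounts to a simple check: a problem is usable only if every feature it declares is one the planner supports. The sketch below, which is not from the paper, illustrates this with the PDDL :requirements vocabulary; the planner capability set and problem names are hypothetical.

    # Minimal sketch of Recommendation 8: keep only the benchmark problems
    # whose declared requirement flags are a subset of the features the
    # planner supports. Flag names follow the PDDL :requirements vocabulary;
    # the capability set and problem entries below are invented examples.

    PLANNER_CAPABILITIES = {":strips", ":typing", ":conditional-effects"}

    BENCHMARK = {
        "problem-01": {":strips", ":typing"},
        "problem-02": {":strips", ":typing", ":durative-actions"},  # needs an unsupported feature
    }

    def matching_problems(planner_caps, benchmark):
        """Return the problems whose stated requirements the planner can handle."""
        return [name for name, reqs in benchmark.items() if reqs <= planner_caps]

    print(matching_problems(PLANNER_CAPABILITIES, BENCHMARK))
    # ['problem-01']  -- problem-02 is excluded rather than counted as a failure

Filtering in this way keeps a planner from being penalized for problems it was never intended to handle, which is exactly the skewing discussed below.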
Depending on the cause, the results can be skewed; for example, a planner may
be unfairly maligned for failing to solve a problem that it was
specifically designed not to solve. The above recommendation
addresses gaps in the specification of the problem set, but some mismatches remain between
the capabilities specifiable in PDDL and those that planners actually possess.
Recommendation 9: Planner developers should adopt a
vocabulary for their planner's capabilities, analogous to the PDDL :requirements flags, and
specify those capabilities in the planner's distribution.
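One way to realize this recommendation, sketched below under assumed names, is a small machine-readable capability manifest shipped with the planner. The file name, schema, and planner name here are illustrative only; the flag names reuse the PDDL :requirements vocabulary.

    # Hedged sketch of Recommendation 9: a planner distribution ships a
    # manifest of the capabilities it supports, stated in PDDL-flag terms.
    # The schema and file name are hypothetical.
    import json

    manifest = {
        "planner": "example-planner",      # hypothetical planner name
        "version": "1.0",
        "supports": [":strips", ":typing", ":conditional-effects"],
        "explicitly_unsupported": [":durative-actions", ":fluents"],
    }

    with open("capabilities.json", "w") as f:   # hypothetical manifest file
        json.dump(manifest, f, indent=2)

An evaluator could then read such a manifest and apply the subset check from the earlier sketch automatically, rather than inferring a planner's capabilities from its documentation.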