
Planner Assumption 1: Is the Latest Version the Best?

Our results suggest that new versions run faster but often do not solve more problems. Thus, the newest version may not represent the "best" performance (depending on your definition) for that planner. Competitions in some other fields, e.g., automated theorem proving, require the previous year's best performer to compete as well; this establishes a baseline of performance and allows a comparison that shows how the focus may shift over time.
Recommendation 10: If the primary evaluation metric is speed, then the newest version may be the best competitor to run against. If the metric is the number of problems solved, or if one wishes to establish what progress has been made, then it may be worth running against an older version as well. If recommendation 9 has been followed, then evaluators should select a version based on this guidance.
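
The decision rule in Recommendation 10 can be written down as a small sketch. Everything here is an illustrative assumption rather than part of the study: the `VersionResult` record, the `baselines_to_run` helper, the metric names "speed" and "solved", and the numbers in the example are all made up. The recommendation also does not say which older version to add, so the sketch assumes the older version that solved the most problems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionResult:
    """Aggregate outcome for one planner version (hypothetical schema)."""
    version: str
    release_order: int    # larger means newer
    problems_solved: int
    mean_seconds: float   # mean run time; recorded for reference only


def baselines_to_run(results, metric, track_progress=False):
    """Pick the planner versions to include as comparison baselines.

    metric is "speed" or "solved". The newest version is always included;
    an older version is added when the metric is problems solved or when
    the evaluator wants to show progress over time.
    """
    if metric not in ("speed", "solved"):
        raise ValueError(f"unknown metric: {metric!r}")
    ordered = sorted(results, key=lambda r: r.release_order)
    newest, older = ordered[-1], ordered[:-1]
    chosen = [newest]
    if older and (metric == "solved" or track_progress):
        # Assumption: the most useful older baseline is the one that
        # solved the most problems.
        chosen.append(max(older, key=lambda r: r.problems_solved))
    return [r.version for r in chosen]


# Made-up numbers: the newer version is faster but solves fewer problems.
example = [
    VersionResult("v1", 1, problems_solved=40, mean_seconds=12.0),
    VersionResult("v2", 2, problems_solved=38, mean_seconds=3.5),
]
print(baselines_to_run(example, "speed"))
print(baselines_to_run(example, "solved"))
```

With a speed metric only the newest version is selected; with a problems-solved metric, or with `track_progress=True`, the older version is run as well.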


©2002 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.