Metric Assumption 2: Does the Number of Plan Steps Vary?



It certainly can. If one neglects quality measures, then efforts to declare a best planner penalize some planners, namely those that trade computation time for better solutions.

Recommendation 14: To expedite generalizing across studies, reports should describe performance in terms of what was solved (how many problems of what types), how much time was required, and what the quality of the solutions was. Trade-offs should be reported when possible, e.g., a 12% increase in computation time for a 30% decrease in plan length. Additionally, if the design goal was to find an optimal solution, the planner should be compared to other planners with that same design goal.
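As a rough illustration of the trade-off reporting suggested above, the following Python sketch computes signed percent changes between two planner runs. The function names, the dictionary keys (`time_s`, `plan_length`), and the measurements are hypothetical, chosen only to reproduce the 12%/30% example; they are not data from the study.

```python
def percent_change(baseline: float, candidate: float) -> float:
    """Signed percent change of candidate relative to baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (candidate - baseline) / baseline


def describe_tradeoff(baseline: dict, candidate: dict) -> str:
    """Summarize how time and plan length changed between two runs."""
    time_change = percent_change(baseline["time_s"], candidate["time_s"])
    length_change = percent_change(
        baseline["plan_length"], candidate["plan_length"]
    )
    return (
        f"{time_change:+.0f}% computation time, "
        f"{length_change:+.0f}% plan length"
    )


# Hypothetical measurements for one problem (assumed values, not real results).
baseline = {"time_s": 10.0, "plan_length": 50}
candidate = {"time_s": 11.2, "plan_length": 35}

summary = describe_tradeoff(baseline, candidate)
print(summary)
```

Reporting both changes side by side, rather than ranking on time alone, lets a reader judge whether the extra computation was worth the shorter plan.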

Good metrics of plan quality are sorely needed. The latest version of the PDDL specification supports the definition of problem-specific metrics [Fox & Long 2002]; these metrics indicate whether total-time (a new concept supported by the specification of action durations) or specified functions should be minimized or maximized. This addition is an excellent start, but general metrics beyond plan-length and total-time are also needed to expedite comparisons across problems.

Recommendation 15: Developing good metrics is a valuable research contribution. Researchers should consider it a worthwhile project, conference organizers and reviewers should encourage papers on the topic, and planner developers should implement their planners to be responsive to new quality metrics (i.e., support tunable heuristics or evaluation criteria).
