Experimental Analysis

We are faced with three substantially different approaches that are not easy to compare: their performance depends on domain features as varied as the structure in the transition model; the type, syntax, and length of the temporal reward formula; the presence of rewards that are unreachable or irrelevant to the optimal policy; and the availability of good heuristics and control knowledge, among others, as well as on the interactions between these factors. In this section, we report an experimental investigation into the influence of some of these factors and try to answer the questions raised previously:¹⁰

is the dynamics of the domain the predominant factor affecting performance?
is the type of reward a major factor?
is the syntax used to describe rewards a major factor?
is there an overall best method?
is there an overall worst method?
does the preprocessing phase of PLTLMIN pay off, compared to PLTLSIM?
does the simplicity of the FLTL translation compensate for blind-minimality, or does the benefit of true minimality outweigh the cost of PLTLMIN preprocessing?
are the dynamic analyses of rewards in PLTLSTR and FLTL effective?
is one of these analyses more powerful, or are they rather complementary?

In some cases, but not all, we were able to identify systematic patterns. The results in this section were obtained on a 2.6 GHz Pentium 4 machine running GNU/Linux 2.4.20 with 500 MB of RAM.

Sylvie Thiebaux 2006-01-20