In our experiments with artificial domains, we found PLTLSTR and
FLTL preferable to state-based PLTL approaches in most cases. If
one insists on using the latter, we strongly recommend preprocessing.
FLTL is the technique of choice when the reward requires tracking a
long sequence of events or when the desired behaviour is composed of
many elements with identical rewards, as in the sketch below.
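As a purely illustrative sketch (the atoms $a$, $b$, $c$ are hypothetical, $\ominus$ reads "at the previous step", $\bigcirc$ "at the next step", and $\$$ is the FLTL reward flag), a reward for completing the sequence $a$ then $b$ then $c$ might be written

\[ c \wedge \ominus (b \wedge \ominus a) \quad \text{(PLTL)} \qquad \Box\,\bigl(a \rightarrow \bigcirc(b \rightarrow \bigcirc(c \rightarrow \$))\bigr) \quad \text{(FLTL)} \]

Tracking a longer sequence, or rewarding many such elements identically, simply extends the same pattern.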
For response formulae, we advise the use of PLTLSTR if the probability of reaching the goal
is low or achieving the goal is very costly, and conversely, we advise
the use of FLTL if the probability of reaching the triggering
condition is low or if reaching it is very costly, as sketched below.
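To fix intuitions, one plausible rendering of a response formula rewarding the achievement of goal $g$ following trigger $c$ (both atoms hypothetical, with $\Diamond^{-}$ reading "at some point in the past", and glossing over the subtleties of FLTL's reward semantics) is

\[ g \wedge \Diamond^{-} c \quad \text{(PLTL)} \qquad \Box\,\bigl(c \rightarrow \Diamond(g \wedge \$)\bigr) \quad \text{(FLTL)} \]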
In all cases, attention should be paid to the syntax of the reward formulae, and in particular to minimising their length. Indeed, as could be expected, we
found the syntax of the formulae and the type of non-Markovian reward
they encode to be a predominant factor in determining the difficulty
of the problem, much more so than the features of the Markovian
dynamics of the domain.