A Class of Reward-Normal Formulae
The existing decision procedure [41] for determining
whether a formula is reward-normal is guaranteed to terminate
finitely, but involves the construction and comparison of automata and
is rather intricate in practice. It is therefore useful to give a
simple syntactic characterisation of a set of constructors for
obtaining reward-normal formulae even though not all such formulae are
so constructible.
We say that a formula is material iff it contains no
and no temporal operators - that is, the material formulae are the
boolean combinations of atoms.
We consider four operations on behaviours representable by formulae of
$FLTL. Firstly, a behaviour may be delayed for a specified number of
timesteps. Secondly, it may be made conditional on a material trigger.
Thirdly, it may be started repeatedly until a material termination
condition is met. Fourthly, two behaviours may be combined to form
their union. These operations are easily realised syntactically by
corresponding operations on formulae. Where is any material
formula:
We have shown [41] that the set of reward-normal formulae
is closed under delay, cond (for any material ), loop (for any material ) and union, and also that the
closure of under these operations represents a class of
behaviours closed under intersection and concatenation as well as
union.
Many familiar reward-normal formulae are obtainable from by
applying the four operations. For example,
is
.
Sometimes a paraphrase is necessary. For example,
is not of the required form
because of the
in the antecedent of the conditional, but the
equivalent
is
. Other cases are not so
easy. An example is the formula
which stipulates a reward the first time happens and which is not
at all of the form suggested. To capture the same behaviour using the
above operations requires a formula like
.
Sylvie Thiebaux
2006-01-20