Unfortunately, the MDPs produced by PLTLSIM are far from
minimal. Although they could be postprocessed for minimisation before
invoking the MDP solution method, the above expansion may still
constitute a serious bottleneck. Therefore,
Bacchus et al. [2] consider a more complex two-phase translation, which we call
PLTLMIN, capable of producing an MDP also satisfying property
(3). Here, a preprocessing phase iterates over all states in , and
computes, for each state , a set of subformulae, where the
function is the solution of the fixpoint equation
. Only subformulae in
will be
candidates for inclusion in the sets labelling the respective e-states
labelled with . That is, the subsequent expansion phase will be as
above, but taking
and
instead of
and
. As the subformulae in are exactly
those that are relevant to the way feasible execution sequences
starting from e-states labelled with are rewarded, this leads the
expansion phase to produce a minimal equivalent MDP.
Figure 5 shows the equivalent MDP produced by
PLTLMIN for the NMRDP example in Figure 3, together with
the function from which the labels are built. Observe how this MDP
is smaller than the PLTLSIM MDP: once we reach the state on the
left-hand side in which is true and is false, there is no
point in tracking the values of subformulae, because cannot become
true and so the reward formula cannot either. This is reflected by the
fact that only contains the reward formula.
In the worst case, computing requires a space, and a number of
iterations through , exponential in . Hence the question
arises of whether the gain during the expansion phase is worth the
extra complexity of the preprocessing phase. This is one of the
questions our experimental analysis in Section 5 will try to answer.
Figure 4:
Equivalent MDP Produced by PLTLSIM
|
Figure 5:
Equivalent MDP Produced by PLTLMIN
|
Sylvie Thiebaux
2006-01-20