PLTLSIM

For the choice of the $\Psi$ s, Bacchus et al. [2] consider two cases. In the simple case, which we call PLTLSIM, an MDP obeying properties (1) and (2) is produced by simply labelling each e-state with the set of all subformulae in $\overline{\mbox{Sub}(F)}$ which are true of the sequence leading to that e-state. This MDP is generated forward, starting from the initial e-state labelled with

and with the set $\Psi_0 \subseteq \overline{\mbox{Sub}(F)}$ of all subformulae which are true of the sequence $\langle s_0 \rangle$ . The successors of any e-state labelled by NMRDP state

and subformula set $\Psi$ are generated as follows: each of them is labelled by a successor

in the NMRDP and by the set of subformulae $\{\psi' \in \overline{\mbox{Sub}(F)} \mid \Psi \models \mbox{Reg}(\psi',s')\}$ . For instance, consider the NMRDP shown in Figure 3. The set $F = \{q \wedge \circleddash \circleddash p\}$ consists of a single reward formula. The set $\overline{\mbox{Sub}(F)}$ consists of all subformulae of this reward formula, and their negations, that is $\overline{\mbox{Sub}(F)}= \{p,q,\circleddash p, \circleddash \circleddash p, q ... ...neg \circleddash \circleddash p, \neg (q \wedge \circleddash \circleddash p)\}$ . The equivalent MDP produced by PLTLSIM is shown in Figure 4.

**Figure 3:** Another Simple NMRDP
$\textstyle \parbox{0.5\textwidth}{\includegraphics[height=0.18\textheight]{figures/pq}}$ $\textstyle \parbox{0.4\textwidth}{\footnotesize In the initial state, both $p$\... ...til $p$\ gets produced, and then to apply $a$\ or $b$\ indifferently forever. }$

Sylvie Thiebaux 2006-01-20