The experiments reported in Section 4 indicate that the confidence factors provided by standard DT software can be used for effective agent control. Combined with some basic reasoning about the action-execution times of different options -- necessitated by the real-time nature of this domain -- the DT-based control function outperformed both random and hand-coded alternatives. Even though the DT was trained in a limited, artificial situation, it proved useful for agent control in a broader scenario.
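The combination described above -- filtering options by whether there is time to execute them, then ranking the survivors by DT confidence -- can be sketched as follows. This is an illustrative assumption of how such a control function might look, not the paper's actual implementation; all names, fields, and values are hypothetical.

```python
# Hypothetical sketch: combine DT confidence factors with reasoning about
# action-execution times. Names and thresholds are illustrative only.

def choose_action(dt_classify, options, time_available):
    """Pick the most confidently successful option the agent has time for.

    dt_classify(option) -> (label, confidence): a trained DT that labels a
    candidate action as 'success' or 'failure' with a confidence factor.
    options: candidate actions, each with an estimated execution time.
    time_available: time (e.g., simulator cycles) before the agent must act.
    """
    best, best_conf = None, 0.0
    for option in options:
        # Real-time constraint: discard options that cannot finish in time.
        if option["exec_time"] > time_available:
            continue
        label, conf = dt_classify(option)
        if label == "success" and conf > best_conf:
            best, best_conf = option, conf
    return best  # None signals a fallback (e.g., clear the ball)

# Toy usage with a stub DT that rates more open options more confidently.
stub_dt = lambda o: ("success", o["openness"])
options = [
    {"name": "pass-left",  "exec_time": 3, "openness": 0.9},
    {"name": "pass-right", "exec_time": 2, "openness": 0.4},
    {"name": "dribble",    "exec_time": 8, "openness": 0.7},
]
print(choose_action(stub_dt, options, time_available=5)["name"])  # pass-left
```

Under this sketch, the dribble option is pruned purely on timing grounds even though its confidence is high, which is the essence of the time-aware reasoning described above.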
Throughout this paper, the multiagent behaviors are tested against an opponent that leaves one side of the field free, while covering the other side heavily. This opponent simulates a situation in which the players without the ball make an effort to move to an open position on the field. Such collaborative reasoning has not yet been implemented in the Soccer Server. However, the fact that the DT is able to exploit open players indicates that reasoning about field positioning when a teammate has the ball would be a useful next step in the development of learned collaborative behaviors.
Along with more variable field positioning, there is still a great deal of future work to be done in this domain. First, one could build additional learned layers on top of the NN and DT layers described in Section 2. The behavior presented in this paper uses the DT as part of a hand-coded, high-level multiagent behavior; however, several of its parameters are chosen arbitrarily. A behavior that learns how to map the DT's classifications and confidence factors to passing, dribbling, and shooting decisions may perform better. Second, on-line adversarial learning methods that can adapt to opponent behaviors during the course of a game may be more successful against a broad range of opponents than current methods.
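To make the first suggestion concrete, one way to view the arbitrarily chosen parameters is as a vector of per-action confidence thresholds that a learner could tune instead of a human. The sketch below is a hypothetical illustration of that idea; the action names, thresholds, and priority order are assumptions, not the paper's behavior.

```python
# Hypothetical sketch: the hand-coded thresholds mapping DT confidence
# factors to pass/dribble/shoot decisions become a parameter vector that
# a learner (e.g., hill-climbing on game outcomes) could optimize.

def decide(confidences, params):
    """Map DT confidence factors to a high-level action choice.

    confidences: dict of action -> DT confidence that the action succeeds.
    params: per-action thresholds; fixed by hand today, learnable in principle.
    """
    # Assumed priority order: prefer shooting, then passing, then dribbling,
    # taking the first action whose confidence clears its threshold.
    for action in ("shoot", "pass", "dribble"):
        if confidences.get(action, 0.0) >= params[action]:
            return action
    return "clear"  # safe default when no option is confident enough

hand_coded = {"shoot": 0.8, "pass": 0.6, "dribble": 0.5}
print(decide({"shoot": 0.3, "pass": 0.7, "dribble": 0.9}, hand_coded))  # pass
```

A learned behavior would replace `hand_coded` with values selected by training rather than by the designer, which is precisely the gap the future-work suggestion identifies.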
Nevertheless, the incorporation of low-level learning modules into a full multiagent behavior that can be used in game situations is a significant advance towards intelligent multiagent behaviors in a complex real-time domain. Furthermore, the ability to reason about the amount of time available to act is essential in domains with continuously changing state. Finally, since DT confidence factors are effective in this domain, they are a potentially useful new tool for agent control in general. These contributions promise an exciting future for learning-based methods in real-time, adversarial, multiagent domains.