A ground-breaking system for robotic soccer, and the one that served as the inspiration for our work, is the Dynamo system developed at the University of British Columbia [Sahota et al., 1995]. This system was designed to be capable of supporting several robots per team, but most work has been done in a 1 vs. 1 scenario. Sahota used this system to introduce a decision-making strategy called reactive deliberation, which was used to choose from among seven hard-wired behaviors [Sahota, 1993]. Although this approach worked well for a specific task, ML is needed both to avoid the cumbersome task of hard-wiring the robots for every new situation and to expand into the more complex multiagent environment.
Modeled closely after the Dynamo system, the authors have also developed a real-world robotic soccer system [Achim et al., 1996]. The main differences from the Dynamo system are that the robots are much smaller, there are five on each team, and they use infrared communication rather than radio frequency.
The robotic soccer system being developed in Asada's lab is very different both from the Dynamo system and from our own [Asada et al., 1994a, Asada et al., 1994b]. Asada's robots are larger and are equipped with on-board sensing capabilities. They have been used to develop low-level behaviors such as shooting and avoiding, as well as a reinforcement learning (RL) technique for combining behaviors [Asada et al., 1994a, Asada et al., 1994b]. By reducing the state space significantly, Asada was able to use RL to learn to shoot a stationary ball into a goal; his best result in simulation is a 70% scoring rate. He has also done some work on combining different learned behaviors with a separate learned decision mechanism on top [Asada et al., 1994b]. While the goals of this research are very similar to our own, the approach is different: Asada has developed a sophisticated robot system with many advanced capabilities, while we have chosen to focus on a simple, robust design that lets us concentrate our efforts on learning low-level behaviors and high-level strategies. We believe that both approaches are valuable for advancing the state of the art of robotic soccer research.
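To make concrete why aggressively reducing the state space matters for table-based RL, the sketch below runs tabular Q-learning on a toy shooting task with only a handful of coarse state bins. The state variables, discretization, transition model, and rewards are our own illustrative assumptions, not Asada's actual formulation.

```python
"""Minimal sketch: tabular Q-learning over a coarsely discretized state space.

The environment, features, and rewards below are invented for illustration;
they do not reproduce Asada et al.'s learning setup.
"""
import random
from collections import defaultdict

# Coarse state: (distance-to-ball bin, alignment-with-goal bin).
DIST_BINS = 4      # 0 = at the ball ... 3 = far away
ALIGN_BINS = 3     # 0 = lined up with the goal ... 2 = badly misaligned
ACTIONS = ["approach", "turn_toward_goal", "shoot"]

def step(state, action):
    """Toy transition/reward model standing in for the real robot dynamics."""
    dist, align = state
    if action == "approach":
        return (max(dist - 1, 0), align), 0.0, False
    if action == "turn_toward_goal":
        return (dist, max(align - 1, 0)), 0.0, False
    # "shoot" only scores when the robot is at the ball and lined up.
    scored = (dist == 0 and align == 0)
    return state, (1.0 if scored else -1.0), True

Q = defaultdict(float)             # Q[(state, action)] -> estimated value
alpha, gamma, epsilon = 0.1, 0.9, 0.2

def policy(state):
    """Epsilon-greedy action selection over the small Q-table."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

for episode in range(5000):
    state = (random.randrange(DIST_BINS), random.randrange(ALIGN_BINS))
    done = False
    while not done:
        action = policy(state)
        nxt, reward, done = step(state, action)
        target = reward if done else reward + gamma * max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (target - Q[(state, action)])
        state = nxt

# With only DIST_BINS * ALIGN_BINS = 12 states the table converges quickly;
# at the "at ball, lined up" state the greedy action should be "shoot".
print(max(ACTIONS, key=lambda a: Q[((0, 0), a)]))
```

The point of the sketch is simply that with a dozen coarse states the value table is tiny and learning converges in a few thousand trials, whereas a fine-grained continuous state description would make tabular RL intractable.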
Although real robotic systems, such as those mentioned above and the many new ones being built for robotic soccer tournaments [Kim, 1996, Kitano et al., 1995], are needed for studying certain robotic issues, it is often possible to conduct research more efficiently in a well-designed simulator. Several researchers have previously used simulated robotic soccer to study ML applications. Using the Dynasim soccer simulator [Sahota, 1996, Sahota, 1993], Ford et al. used an RL approach with sensory predicates to learn to choose among low-level behaviors [Ford et al., 1994]. Using the simulator described below, Stone and Veloso used Memory-based Learning to allow a player to learn when to shoot and when to pass the ball [Stone and Veloso, 1996a]. In the RoboCup Soccer Server [Noda, 1995], Matsubara et al. used a neural network to make the same shoot-or-pass decision [Matsubara et al., 1996].
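As a rough illustration of the shoot-versus-pass decision learning cited above, the following sketch uses a memory-based (nearest-neighbour) classifier over a few hand-made situations. The features, stored examples, and decision rule are invented for illustration and do not reproduce the representations used in the cited work.

```python
"""Illustrative memory-based shoot-vs-pass decision (k-nearest-neighbour).

The features and training examples are assumptions made for this sketch, not
the actual inputs used by Stone and Veloso (1996a) or Matsubara et al. (1996).
"""
import math

# Each stored example: (feature vector, decision). The hypothetical features
# are distance to goal, angle to goal, and distance to the nearest defender.
memory = [
    ((2.0, 0.1, 3.0), "shoot"),   # close, well-angled, defender far away
    ((2.5, 0.2, 0.5), "pass"),    # close, but a defender blocks the shot
    ((8.0, 0.4, 4.0), "pass"),    # too far from the goal to shoot
    ((3.0, 0.0, 2.5), "shoot"),
    ((7.5, 0.9, 1.0), "pass"),
]

def decide(features, k=3):
    """Return the majority decision among the k closest stored situations."""
    nearest = sorted(memory, key=lambda ex: math.dist(features, ex[0]))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

print(decide((2.2, 0.15, 2.8)))   # expect "shoot" given this toy memory
```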
The work described in this article builds on previous work by learning a more difficult behavior: shooting a moving ball into a goal. The behavior is specifically designed to be useful for more complex multiagent behaviors as described in Section 5.