Although we are pursuing research directions with our real robotic system [Achim et al., 1996], the simulator facilitates extensive training and testing of learning methods. All of the research reported in this article was conducted in simulation.
Both the simulator and the real-world system are based closely on systems designed by the Laboratory for Computational Intelligence at the University of British Columbia [Sahota et al., 1995]. In particular, the simulator's code is adapted from their code, for which we thank Michael Sahota, whose work [Sahota, 1993] and personal correspondence [Sahota, 1994] have been motivating and invaluable. The simulator facilitates the control of any number of agents and a ball within a designated playing area. Care has been taken to ensure that the simulator models real-world responses (friction, conservation of momentum, etc.) as closely as possible. Sensor noise of variable magnitude can be included to model more or less precise real systems. A graphic display allows the researcher to watch the action in progress, or the graphics can be toggled off to speed up the rate of the experiments. Figure 2(a) shows the simulator graphics.
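The effect of the noise parameter can be pictured as perturbing each coordinate the simulator reports; the following sketch assumes a zero-mean Gaussian model and an illustrative function name, not the simulator's actual noise code.

    import random

    def noisy_reading(true_value, noise_magnitude):
        # Perturb a reported coordinate (x, y, or orientation) with zero-mean
        # noise whose standard deviation is the chosen magnitude. The Gaussian
        # form here is an assumption used only for illustration.
        return true_value + random.gauss(0.0, noise_magnitude)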
Figure 2: (a) The graphic view of our simulator. Eventually teams of five or more agents will compete in a real-time game of robotic soccer. (b) The interface between the clients and the simulator.
The simulator is based on a client-server model in which the server models the real world and reports the state of the world, while the clients control the individual agents (see Figure 2(b)). Since the moving objects in the world (i.e., the agents and the ball) all have a position and orientation, the simulator describes the current state of the world to the clients by reporting the x, y, and θ coordinates of each of the objects, indicating their positions and orientations. The clients periodically send throttle and steering commands to the simulator indicating how the agents should move. The simulator's job is to correctly model the motion of the agents based on these commands, as well as the motion of the ball based on its collisions with agents and walls. The parameters modeled and the underlying physics are described in more detail in Appendix A.
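To make the exchange concrete, the following sketch shows one plausible shape for the reported state and the returned command; the record names, field ranges, and the control_step hook are illustrative assumptions rather than the simulator's actual interface.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Pose:
        x: float      # position along the field
        y: float      # position across the field
        theta: float  # orientation in radians

    @dataclass
    class WorldState:
        agents: List[Pose]  # one pose per agent, as reported by the server
        ball: Pose          # the ball's orientation field is unused

    @dataclass
    class Command:
        throttle: float  # forward drive, e.g. in [-1, 1]
        steering: float  # steering angle, e.g. in radians

    def control_step(state: WorldState, agent_id: int) -> Command:
        # Called once per simulator cycle: map the reported world state to the
        # throttle and steering values sent back to the server.
        return Command(throttle=0.0, steering=0.0)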
The clients are equipped with a path planning capability that allows them to follow the shortest path between two positions [Latombe, 1991; Reeds & Shepp, 1991]. For the purposes of this article, the only path planning needed is the ability to steer along a straight line. This task is not trivial, since the client can only control the agent's motion at discrete time intervals. Our algorithm controls the agent's steering based on its offset from the correct heading as well as its distance from the line to be followed. After a short adjustment period, this algorithm has the agent steering in exactly the right direction while staying within a small distance of the line. Thus the agent is able to reliably strike the ball in a given direction. For more details on the line-following method, see Appendix B.
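A simple way to realize such a controller, sketched below, is a proportional rule that steers against both the heading error and the signed lateral offset from the target line; the gains, saturation limit, and sign conventions (counterclockwise-positive angles) are illustrative assumptions, not the exact algorithm given in Appendix B.

    import math

    def line_follow_steering(x, y, theta, line_start, line_end,
                             k_head=2.0, k_dist=1.0,
                             max_steer=math.radians(30)):
        # Steering command that pulls the agent onto the line from
        # line_start to line_end and aligns its heading with it.
        (x0, y0), (x1, y1) = line_start, line_end
        line_angle = math.atan2(y1 - y0, x1 - x0)
        # Signed perpendicular offset from the line (positive = left of it)
        ux, uy = math.cos(line_angle), math.sin(line_angle)
        dist = ux * (y - y0) - uy * (x - x0)
        # Heading error wrapped to [-pi, pi]
        err = theta - line_angle
        heading_err = math.atan2(math.sin(err), math.cos(err))
        # Steer against both terms and saturate at the steering limit
        steer = -(k_head * heading_err + k_dist * dist)
        return max(-max_steer, min(max_steer, steer))

Iterating such a rule at each control interval drives both the lateral offset and the heading error toward zero, which is the behavior the line-following method needs in order to strike the ball reliably.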