Next: Flexible Team Structure Up: The CMUnited-97 Simulator Team Previous: Predictive Memory

Layered Learning

Once the world model is successfully created, the agents must use it to respond effectively to the environment. As described in Section 2, internal behaviors update the internal state while external behaviors produce executable actuator commands. Spanning both internal and external behaviors, layered learning [6] is our bottom-up hierarchical approach to client behaviors that allows for machine learning at the various levels (Figure 3). The key points of the layered learning technique are as follows:

The difficult aspects of the domain determine which behaviors are to be learned.
The learned behaviors are combined in a vertical fashion, one being used as a part of the other.

Figure 3: An overview of the Layered Learning framework. It is designed for use in domains that are too complex to learn a mapping straight from sensors to actuators. We use a hierarchical, bottom-up approach.
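The vertical combination in the key points above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the nearest-mean classifier is a toy stand-in for the actual learners, and every name, feature, and training example below (`fit_nearest_mean`, `can_intercept`, `pass_feature`, `pass_ok`, ball speed, opponent distance) is hypothetical rather than taken from the CMUnited-97 clients.

```python
from typing import Callable, Sequence


def fit_nearest_mean(xs: Sequence[float], labels: Sequence[int]) -> Callable[[float], bool]:
    """Toy learner (an assumption, not the paper's method): classify a value
    as positive when it lies closer to the mean of the positive examples
    than to the mean of the negative examples."""
    pos = [x for x, y in zip(xs, labels) if y]
    neg = [x for x, y in zip(xs, labels) if not y]
    mu_pos = sum(pos) / len(pos)
    mu_neg = sum(neg) / len(neg)
    return lambda x: abs(x - mu_pos) < abs(x - mu_neg)


# Layer 1 (hypothetical): learn ball interception from ball speed alone.
# Made-up training data: slow balls were intercepted, fast balls were not.
can_intercept = fit_nearest_mean([0.5, 1.0, 2.5, 3.0], [1, 1, 0, 0])


def pass_feature(ball_speed: float, opponent_distance: float) -> float:
    """Layer 2 input: the learned layer-1 behavior is used as a part of the
    higher-level feature, which is the vertical combination."""
    return opponent_distance if can_intercept(ball_speed) else 0.0


# Layer 2 (hypothetical): learn pass evaluation on top of layer 1.
# Each example is (ball speed at receiver, nearest opponent distance, success).
examples = [(1.0, 8.0, 1), (0.8, 6.0, 1), (1.0, 1.0, 0), (3.0, 9.0, 0)]
_pass_classifier = fit_nearest_mean(
    [pass_feature(speed, dist) for speed, dist, _ in examples],
    [label for _, _, label in examples],
)


def pass_ok(ball_speed: float, opponent_distance: float) -> bool:
    """Higher-level behavior built on the lower-level learned one."""
    return _pass_classifier(pass_feature(ball_speed, opponent_distance))
```

The point of the design is that layer 2 never sees the raw interception problem: it is trained only after layer 1 is fixed, and it consumes layer 1's output as one of its inputs.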

Table 1 illustrates possible behavior levels within the robotic soccer domain. Because of the complexity of the domain, it is futile to try to learn intelligent behaviors straight from the primitives provided by the server. Instead, we identified useful low-level skills that must be learned before moving on to higher-level strategies. Using our own experience and insights to help the clients learn, we acted as human coaches do when they teach young children how to play real soccer.

Table 1: Examples of different behavior levels.

The low-level behaviors, such as ball interception and passing, are external behaviors involving direct action in the environment. Higher-level behaviors, such as strategic positioning and adaptation, are internal behaviors involving changes to the agent's internal state. The type of learning used at each level depends upon the task characteristics. We have used neural networks and decision trees to learn ball interception and passing, respectively [6]. These off-line approaches are appropriate for opponent-independent tasks that can be trained outside of game situations. For behaviors that depend on the opponents, we are using on-line reinforcement learning approaches. Adversarial actions are clearly opponent-dependent. Team collaboration and action selection can also benefit from adaptation to particular opponents.
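To make the off-line versus on-line distinction concrete, the following sketch shows one generic form of on-line adaptation: an incremental action-value estimate with epsilon-greedy selection, updated after every outcome observed during play. The text does not specify which reinforcement learning algorithm is used, so this is an assumed stand-in, and the class name, action names, rewards, and parameter values are all hypothetical.

```python
import random


class OnlineActionValues:
    """Generic incremental value estimates (an assumed stand-in, not the
    authors' algorithm). Values are updated during play, so they can
    track the behavior of a particular opponent."""

    def __init__(self, actions, step_size=0.1, epsilon=0.1, seed=0):
        self.q = {action: 0.0 for action in actions}
        self.step_size = step_size
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def choose(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.q))
        return max(self.q, key=self.q.get)

    def update(self, action, reward):
        # Move the estimate a fraction of the way toward the observed reward.
        self.q[action] += self.step_size * (reward - self.q[action])


# Hypothetical opponent that always blocks passes to the left.
learner = OnlineActionValues(["pass_left", "pass_right"])
for _ in range(50):
    learner.update("pass_left", 0.0)
    learner.update("pass_right", 1.0)
```

After these updates the estimate for `pass_right` is well above that for `pass_left`, so greedy selection favors it against this opponent. A different opponent would drive the same learner to different values, which is why such behaviors cannot be trained once off-line.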


Peter Stone
Sun Dec 7 06:54:15 EST 1997