My dissertation is available as CMU Computer Science Tech Report CMU-CS-98-187:
Layered Learning in Multi-Agent Systems

Thesis committee:
Manuela M. Veloso, Chair
Andrew W. Moore
Herbert A. Simon
Victor R. Lesser (University of Massachusetts, Amherst)

An updated version is available as a book from MIT Press:
Layered Learning in Multiagent Systems: A Winning Approach to Robotic Soccer
by Peter Stone
MIT Press, 2000.
ISBN: 0262194384
First, the thesis defines a team member agent architecture within which a flexible team structure is presented, allowing agents to decompose the task space into flexible roles and to switch roles smoothly while acting. Team organization is achieved by the introduction of a locker-room agreement: a collection of conventions followed by all team members. It defines agent roles, team formations, and pre-compiled multi-agent plans. In addition, the team member agent architecture includes a communication paradigm for domains with single-channel, low-bandwidth, unreliable communication. The communication paradigm facilitates team coordination while being robust to lost messages and active interference from opponents.
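The locker-room agreement can be pictured as a shared data structure fixed before the game: formations made of roles, with a convention for switching roles on the fly. The sketch below is purely illustrative; the class names, field coordinates, and nearest-to-the-ball switching heuristic are assumptions, not the thesis's actual code.

```python
# Illustrative sketch of a locker-room agreement: formations are named
# collections of roles, and every teammate carries the same table, so
# role switches need no negotiation at run time.
from dataclasses import dataclass

@dataclass(frozen=True)
class Role:
    name: str
    home_x: float   # nominal field position for this role
    home_y: float

@dataclass
class Formation:
    name: str
    roles: list     # one Role per team member

# Pre-compiled conventions shared by every teammate before play begins.
LOCKER_ROOM_AGREEMENT = {
    "4-3-3": Formation("4-3-3", [Role("goalie", -50.0, 0.0),
                                 Role("defender", -30.0, 10.0),
                                 Role("midfielder", 0.0, 0.0),
                                 Role("forward", 30.0, -10.0)]),
}

def switch_role(formation: Formation, ball_x: float) -> Role:
    """Flexible role switching (stand-in heuristic): take the role
    whose home position is nearest the ball along the x axis."""
    return min(formation.roles, key=lambda r: abs(r.home_x - ball_x))
```

Because every agent evaluates the same table with the same heuristic, teammates reach consistent role assignments even when communication is unavailable.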
Second, the thesis introduces layered learning, a general-purpose machine learning paradigm for complex domains in which learning a mapping directly from agents' sensors to their actuators is intractable. Given a hierarchical task decomposition, layered learning allows for learning at each level of the hierarchy, with learning at each level directly affecting learning at the next higher level.
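The essence of layered learning is that each learned layer's output becomes part of the input representation for the layer above it. A minimal generic sketch, in which the layers are placeholder (fit, predict) callables rather than the thesis's actual learned skills:

```python
# Illustrative sketch of layered learning: train layers bottom-up,
# feeding each layer's predictions into the next layer's training set.

def train_layered(layers, dataset):
    """layers: list of (fit, predict) pairs, lowest layer first.
    dataset: list of feature tuples. Each layer is trained on the
    dataset augmented with the predictions of the layers below it."""
    trained = []
    for fit, predict in layers:
        model = fit(dataset)
        # The learned sub-behavior becomes an input feature
        # for the next layer up the task hierarchy.
        dataset = [example + (predict(model, example),) for example in dataset]
        trained.append((model, predict))
    return trained, dataset
```

In the thesis's instantiation, for example, a learned individual skill (ball interception) feeds the training of a multi-agent behavior (pass evaluation), which in turn feeds a team behavior.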
Third, the thesis introduces a new multi-agent reinforcement learning algorithm, namely team-partitioned, opaque-transition reinforcement learning (TPOT-RL). TPOT-RL is designed for domains in which agents cannot necessarily observe the state changes when other team members act. It exploits local, action-dependent features to aggressively generalize its input representation for learning and partitions the task among the agents, allowing them to simultaneously learn collaborative policies by observing the long-term effects of their actions.
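The TPOT-RL idea can be sketched as a value table indexed by a coarse, action-dependent feature rather than by the full world state, updated toward the long-term reward of an action without ever consulting a successor state (transitions are opaque). The feature and update details below are illustrative assumptions, not the thesis's exact formulation.

```python
# Illustrative sketch in the spirit of TPOT-RL: each agent learns only
# its own partition of the team task, generalizing over states via an
# action-dependent feature, and updating from delayed long-term reward.

class TPOTRLAgent:
    def __init__(self, alpha=0.1):
        self.alpha = alpha
        self.q = {}  # maps (feature_value, action) -> learned value

    def feature(self, state, action):
        """Action-dependent feature (assumed): e.g. a coarse estimate
        of whether acting in this direction is likely to succeed."""
        return state.get(action, "unknown")

    def choose(self, state, actions):
        """Greedy action selection over the aggregated feature space."""
        return max(actions,
                   key=lambda a: self.q.get((self.feature(state, a), a), 0.0))

    def update(self, state, action, long_term_reward):
        """Move the value toward the observed long-term reward.
        No successor-state value appears: transitions are opaque."""
        key = (self.feature(state, action), action)
        old = self.q.get(key, 0.0)
        self.q[key] = old + self.alpha * (long_term_reward - old)
```

Because each agent's table is keyed only by local features and its own actions, the team's policy is learned in parallel across the partition rather than in one monolithic state space.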
Fourth, the thesis contributes a fully functioning multi-agent system that incorporates learning in a real-time, noisy domain with teammates and adversaries. Detailed algorithmic descriptions of the agents' behaviors as well as their source code are included in the thesis.
Empirical results validate all four contributions within the simulated robotic soccer domain. The generality of the contributions is verified by applying them to the real robotic soccer and network routing domains. Ultimately, this dissertation demonstrates that by learning portions of their cognitive processes, selectively communicating, and coordinating their behaviors via common knowledge, a group of independent agents can work towards a common goal in a complex, real-time, noisy, collaborative, and adversarial environment.
Thesis
The complete dissertation in one file:
thesis.ps.gz (253 pages)
Each section in its own file:
Abstract and Contents (pp. 1-18)
Chapter 1: Introduction (pp. 19-24)
1.1 Motivation
1.2 Objectives and Approach
1.3 Contributions
1.4 Reader's Guide to the Thesis
Chapter 2: Substrate Systems (pp. 25-52)
2.1 Overview
2.2 The RoboCup Soccer Server
2.3 The CMUnited-97 Real Robots
2.4 Network Routing
Chapter 3: Team Member Agent Architecture (pp. 53-90)
3.1 Periodic Team Synchronization (PTS) Domains
3.2 Architecture Overview
3.3 Teamwork Structure
3.4 Communication Paradigm
3.5 Implementation in Robotic Soccer
3.6 Results
3.7 Transfer to Real Robots
3.8 Discussion and Related Work
Chapter 4: Layered Learning (pp. 91-104)
4.1 Principles
4.2 Instantiation in Simulated Robotic Soccer
4.3 Discussion
4.4 Related Work
Chapter 5: Learning an Individual Skill (pp. 105-114)
5.1 Ball Interception in the Soccer Server
5.2 Training
5.3 Results
5.4 Discussion
5.5 Related Work
Chapter 6: Learning a Multi-Agent Behavior (pp. 115-134)
6.1 Decision Tree Learning for Pass Evaluation
6.2 Using the Learned Behaviors
6.3 Scaling up to Full Games
6.4 Discussion
6.5 Related Work
Chapter 7: Learning a Team Behavior (pp. 135-168)
7.1 Motivation
7.2 TPOT-RL
7.3 TPOT-RL Applied to Simulated Robotic Soccer
7.4 TPOT-RL Applied to Network Routing
7.5 Discussion
7.6 Related Work
Chapter 8: Competition Results (pp. 169-180)
8.1 Pre-RoboCup-96
8.2 MiroSot-96
8.3 RoboCup-97
8.4 RoboCup-98
8.5 Lessons Learned from Competitions
Chapter 9: Related Work (pp. 181-208)
9.1 MAS from an ML Perspective
9.2 Robotic Soccer
Chapter 10: Conclusion (pp. 209-214)
10.1 Contributions
10.2 Future Directions
10.3 Concluding Remarks
Appendices (pp. 215-234)
A List of Acronyms
B Robotic Soccer Agent Skills
B.1 CMUnited-98 Simulator Agent Skills
B.2 CMUnited-97 Small-Robot Skills
C CMUnited-98 Simulator Team Behavior Modes
C.1 Conditions
C.2 Effects
D CMUnited Simulator Team Source Code
Bibliography (pp. 235-253)
On-line Appendix