Next: Further Learning Opportunities Up: Issues and Techniques Previous: Modeling of other agents'

How to affect others

When no communication is possible, system designers must decide how the agents will affect one another. Since they exist in the same environment, the agents can affect each other in several ways. Actively, they can be sensed by other agents, or they may be able to change the state of another agent by, for example, pushing it. More indirectly, agents can affect other agents by one of two types of stigmergy [39]. First, active stigmergy occurs when an agent alters the environment so as to affect the sensory input of another agent. For example, a robotic agent might leave a marker behind it for other agents to observe. Goldman and Rosenschein demonstrate an effective form of active stigmergy in which agents heuristically alter the environment in order to facilitate future unknown plans of other agents [31]. Second, passive stigmergy involves altering the environment so that the effects of another agent's actions change. For example, if one agent turns off the main water valve to a building, the effect of another agent turning on the kitchen faucet is altered.

Holland illustrates the concept of passive stigmergy with a robotic system designed to model the behavior of an ant colony confronted with many dead ants around its nest [39]. An ant from such a colony tends to periodically pick up a dead ant, carry it for a short distance, and then drop it. Although the behavior appears to be random, after several hours the dead ants are clustered in a small number of heaps. Over time, the piles become fewer and larger until all the dead ants end up in one pile. Although the ants behave homogeneously and, at least in this case, there is no evidence that they communicate explicitly, they manage to cooperate in achieving a task.

Holland models this situation with a number of identical robots in a small area scattered with pucks [39]. The robots are programmed reactively to move straight (turning at walls) until they are pushing three or more pucks. At that point, the robots back up and turn away, leaving the pucks they were pushing in a cluster. Although the robots do not communicate at all, they are able to collect the pucks into a single pile over time. This effect occurs because a robot that approaches an existing pile head-on adds the pucks it was already pushing to the pile and then turns away. Of course, a robot approaching an existing pile obliquely might take a puck away from the pile, but over time the desired result is accomplished. Like the ants, the robots use passive stigmergy to affect each other's behavior.
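The reactive rule described above can be illustrated with a small grid simulation. This is only a sketch under strong simplifying assumptions and is not Holland's implementation: the world is a square grid, a robot "pushes" every puck in the cell it enters, several robots may share a cell, and oblique approaches that knock pucks off a pile are not modeled. All names (`make_world`, `step`, `count_piles`, `total_pucks`, `THRESHOLD`) are hypothetical.

```python
import random

THRESHOLD = 3  # assumed: a robot backs off once it would be pushing this many pucks
HEADINGS = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # east, south, west, north


def make_world(size, n_pucks, n_robots, rng):
    """Scatter pucks and robots uniformly at random over a size x size grid."""
    pucks = [[0] * size for _ in range(size)]
    for _ in range(n_pucks):
        pucks[rng.randrange(size)][rng.randrange(size)] += 1
    robots = [
        {
            "pos": (rng.randrange(size), rng.randrange(size)),
            "heading": rng.choice(HEADINGS),
            "load": 0,  # number of pucks currently being pushed
        }
        for _ in range(n_robots)
    ]
    return pucks, robots


def step(pucks, robots, rng):
    """Advance every robot by one reactive move. No robot senses another robot."""
    size = len(pucks)
    for robot in robots:
        r, c = robot["pos"]
        dr, dc = robot["heading"]
        nr, nc = r + dr, c + dc
        if not (0 <= nr < size and 0 <= nc < size):
            # Wall ahead: stay in place and pick a new heading.
            robot["heading"] = rng.choice(HEADINGS)
            continue
        if robot["load"] + pucks[nr][nc] >= THRESHOLD:
            # Too many pucks to push: leave the load on the pile ahead,
            # back up (stay in the current cell), and turn away.
            pucks[nr][nc] += robot["load"]
            robot["load"] = 0
            robot["heading"] = rng.choice(HEADINGS)
        else:
            # Keep moving straight, sweeping up any pucks in the next cell.
            robot["load"] += pucks[nr][nc]
            pucks[nr][nc] = 0
            robot["pos"] = (nr, nc)


def count_piles(pucks):
    """Number of grid cells that currently hold at least one puck."""
    return sum(1 for row in pucks for n in row if n > 0)


def total_pucks(pucks, robots):
    """Pucks lying on the grid plus pucks currently being pushed."""
    return sum(map(sum, pucks)) + sum(robot["load"] for robot in robots)


if __name__ == "__main__":
    rng = random.Random(0)
    pucks, robots = make_world(size=10, n_pucks=30, n_robots=3, rng=rng)
    print("piles before:", count_piles(pucks))
    for _ in range(2000):
        step(pucks, robots, rng)
    print("piles after:", count_piles(pucks))
```

In this simplified model a robot only ever drops pucks onto a cell that already holds some, so the number of occupied cells can never grow. The only coupling between robots is the puck layout itself: a pile left by one robot changes what happens when another robot later drives into that cell. Because piles of three or more are never disturbed here, the model clusters pucks but need not converge to a single pile as the physical robots do.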

Mataric explores a similar scenario with more deliberative robots. In this case, the robots use Q-learning to learn behaviors including foraging for pucks as well as homing and following [52]. The robots learn independent policies. They cope with the high-dimensional state space in two ways: progress estimators supply intermediate rewards, and Boolean value predicates condense many states into one. Mataric's robots actively affect each other through observation: a robot learning to follow another robot can base its action on the relative location of the other robot.
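As a rough illustration of how these two devices fit into a learning update, the following sketch combines a tabular Q-learning step with a state made of Boolean predicates and a progress-based intermediate reward. It is not Mataric's implementation: the sensor fields (`pucks_held`, `dist_home`), the predicates, the reward magnitudes, and the function names are all assumptions chosen for the example. Only the behavior names come from the text.

```python
from collections import defaultdict

ACTIONS = ["forage", "home", "follow"]  # behavior names taken from the text


def condense(raw):
    """Collapse a raw sensor reading into a tuple of Boolean predicates.

    Hypothetical predicates: (is the robot holding a puck?, is it near home?).
    Many distinct raw readings map to the same condensed state.
    """
    return (raw["pucks_held"] > 0, raw["dist_home"] < 1.0)


def progress_reward(prev_raw, raw):
    """Hypothetical progress estimator: a small intermediate reward for
    moving closer to home while holding a puck, a small penalty otherwise."""
    if raw["pucks_held"] > 0:
        return 0.1 if raw["dist_home"] < prev_raw["dist_home"] else -0.1
    return 0.0


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step tabular Q-learning update, applied in place."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


if __name__ == "__main__":
    q = defaultdict(float)  # each robot would keep its own independent table
    prev_raw = {"pucks_held": 1, "dist_home": 5.0}
    raw = {"pucks_held": 1, "dist_home": 4.0}
    q_update(q, condense(prev_raw), "home",
             progress_reward(prev_raw, raw), condense(raw))
    print(q[(condense(prev_raw), "home")])
```

With two predicates the table has only four states per robot, however finely the underlying sensors resolve distance. The progress estimator gives the learner a signal on every step, so it does not have to wait for the rare event of delivering a puck.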






Peter Stone
Wed Sep 24 11:54:14 EDT 1997