Thus far, we have not considered what happens once a policy achieves its goal. Since agents rarely set out to achieve a goal and die, we now want to consider how to account for extended activity involving many goals.
One important class of extended activity arises when an agent transforms a whole class of identical objects. We will call this metabolizing the class. Metabolism can be useful or it can make extra work: cooking 100 eggs is useful, at least if you are feeding a lot of people; dirtying 100 forks, however, probably means you have to wash them all.
Whether a policy metabolizes an object class depends in large part on
the binding map it uses. The policy metabolizes its
materials because the material being worked on ceases to be the
leftmost pregoal material as soon as it arrives in a goal state. When
this happens, the binding map
changes bindings and the agent starts to work on a
different object. Policy p never actually sees a material in a goal
state. Of course, the property of being ``leftmost'' is an artifact
of our formalism. What matters to the property of metabolism is
simply that the binding map implement some ordering on the instances
of the material and always choose the minimum under that ordering of
the objects that are in pregoal states. Such an ordering might be
implemented by the agent visually scanning its work surface for an
uncooked egg, but always scanning left-to-right and top-to-bottom. We
will return to these issues in section 8.
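
The following is a minimal sketch of a leftmost-pregoal binding map, not
the formal construction used in this paper. The egg states, the
NEXT_STATE table, and the function names are all hypothetical, chosen only
for illustration. The point is that the policy is only ever shown the
state of the bound object, and the binding moves on as soon as that
object reaches its goal state.

```python
from typing import Callable, List, Optional, Tuple

GOAL = "cooked"
# Hypothetical egg life cycle: each action advances the egg one step.
NEXT_STATE = {"raw": "cracked", "cracked": "frying", "frying": "cooked"}


def leftmost_pregoal(states: List[str]) -> Optional[int]:
    """Binding map: index of the first object not yet in the goal state."""
    for i, s in enumerate(states):
        if s != GOAL:
            return i
    return None  # the binding map is undefined: nothing is left to bind


def policy(state: str) -> str:
    """The policy sees only the bound object's state and advances it."""
    assert state != GOAL, "the policy should never be shown a goal state"
    return NEXT_STATE[state]


def run(
    states: List[str],
    binding_map: Callable[[List[str]], Optional[int]],
    max_steps: int,
) -> Tuple[List[str], List[Tuple[int, str]]]:
    """Apply the policy to whatever the binding map selects at each step."""
    states = list(states)
    trace = []  # (bound index, state shown to the policy) at each step
    for _ in range(max_steps):
        i = binding_map(states)
        if i is None:
            break
        trace.append((i, states[i]))
        states[i] = policy(states[i])
    return states, trace
```

Calling run(["raw", "raw", "raw"], leftmost_pregoal, 100) binds index 0
until that egg is cooked, then index 1, then index 2. No entry of the
trace records a goal state.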
Other binding maps lead to other kinds of behavior, some of which are pathological. If the binding map always chooses the same binding, then metabolism ceases. If the binding map always chooses uncooked eggs but doesn't impose any ordering on them, it might start cooking an infinite number of eggs without ever actually finishing any one of them.
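
The contrast between these binding maps can be sketched in a toy simulation. Everything here is an illustrative assumption rather than part of the formalism: eggs are numbered from 0 in an unbounded supply, an egg is cooked after WORK_NEEDED actions, and freshest_pregoal is just one possible way for an unordered map to misbehave.

```python
from itertools import count

WORK_NEEDED = 3  # hypothetical: actions needed to bring one egg to its goal state


def leftmost_pregoal(progress):
    """Ordered choice: the lowest-numbered egg that is not yet cooked."""
    return next(i for i in count() if progress.get(i, 0) < WORK_NEEDED)


def fixed_binding(progress):
    """Always the same binding, whether or not that egg is already cooked."""
    return 0


def freshest_pregoal(progress):
    """Always an uncooked egg, but never the same one: an untouched egg each time."""
    return len(progress)


def simulate(binding_map, steps):
    """Run `steps` actions against an unbounded egg supply; return eggs finished."""
    progress = {}  # egg index -> actions applied so far (absent means 0)
    for _ in range(steps):
        i = binding_map(progress)
        done = progress.get(i, 0)
        if done < WORK_NEEDED:  # a cooked egg offers the policy nothing to do
            progress[i] = done + 1
    return sum(1 for w in progress.values() if w >= WORK_NEEDED)
```

Under these assumptions, 30 steps finish 10 eggs with leftmost_pregoal, 1 egg with fixed_binding, and 0 eggs with freshest_pregoal, which starts 30 eggs and completes none.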
Metabolism is also an issue for tool use. To metabolize its
materials, the policy must repeatedly reset its tools. An alternative
policy is to metabolize the tools too. Consider the
binding map that uses not only the leftmost pregoal material but also
the leftmost reset tools. Then clearly, p under this binding map
is a solution from any state for which the binding map is defined. This policy
treats tools as disposable. So long as there is an infinite
supply of fresh tools, p will see a succession of states in which
tools are in their reset states. It will never need to execute a
resetting action and so the environment is effectively a
single-state-tool environment. Thus the reduction of section
7.1.3 is unnecessary.
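
As a rough illustration of the trade-off, the sketch below counts
resetting actions under the two treatments of tools. It assumes a
hypothetical two-state tool ("reset" or "used"), one tool use per
material, and an unbounded supply of fresh tools; none of these details
come from the formal development.

```python
def cook_all(n_materials, disposable_tools):
    """Return (work actions, reset actions) needed to process n_materials."""
    tools = ["reset"]  # start with a single tool in its reset state
    work = resets = 0
    for _ in range(n_materials):
        if disposable_tools:
            # Bind the leftmost tool still in its reset state, drawing a
            # fresh one from the (assumed unbounded) supply when none is left.
            if "reset" not in tools:
                tools.append("reset")
            t = tools.index("reset")
        else:
            t = 0  # always bind the same tool
            if tools[t] != "reset":
                tools[t] = "reset"  # an explicit resetting action
                resets += 1
        tools[t] = "used"  # working on a material leaves the tool used
        work += 1
    return work, resets
```

With five materials, the reusable-tool case spends four resetting
actions, while the disposable-tool case spends none and leaves five used
tools behind.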