Given the basic policy for cooking a single egg with a single pan and whisk, we can construct a policy to achieve the goal by composing the basic policy with a binding. This policy will solve the goal from any state in which the bound material is in a non-postgoal state. For a policy to solve the goal from any solvable state, it must be able to change bindings at run time. We will call a function from states to bindings a binding map.
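The composition of a basic policy with a binding can be sketched in code. This is a toy illustration, not the paper's formal model: `compose`, `cook_bound`, and the tuple-of-statuses state representation are all hypothetical names chosen for the sketch.

```python
# Sketch (hypothetical names): composing a basic policy, which acts on
# (state, binding) pairs, with one fixed binding yields a policy over
# states alone -- the form of policy the text constructs.

def compose(basic_policy, binding):
    """Fix the binding argument, producing a policy over states."""
    def policy(state):
        return basic_policy(state, binding)
    return policy

# Toy basic policy: act on whichever material the binding selects.
# A state here is just a tuple of component statuses.
def cook_bound(state, binding):
    return ("cook", binding)

p = compose(cook_bound, binding=0)
assert p(("pregoal", "pregoal")) == ("cook", 0)
```

As the text notes, such a policy only works while the bound material remains usable; lifting this restriction is what motivates the binding map.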
One simple policy for choosing bindings is to impose some a priori ordering on the objects and always use the first acceptable object in the ordering. The ordering might be random, or it might correspond to the order imposed by a visual search mechanism. From a formal standpoint, the ordering does not matter, so we can, without loss of generality, use the left-to-right order of state components in the environment's state tuple. Let β be some binding map that always chooses the leftmost pregoal material and uses some fixed mapping for the tools (we do not care which). This mapping allows us to construct a true solution, and one that requires no internal state in the agent:
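The leftmost-pregoal binding map can be sketched directly. The status labels, the dictionary-valued binding, and the fixed tool assignment below are illustrative assumptions, not the paper's notation:

```python
# Sketch of the leftmost-pregoal binding map (hypothetical names).
# A state is a tuple of component statuses; the map returns the index
# of the leftmost component still in a pregoal state, together with
# some fixed tool assignment (any fixed assignment works).

FIXED_TOOLS = {"pan": 0, "whisk": 0}

def leftmost_binding(state):
    for i, status in enumerate(state):
        if status == "pregoal":
            return {"material": i, **FIXED_TOOLS}
    return None  # undefined when no pregoal material remains

assert leftmost_binding(("goal", "pregoal", "pregoal"))["material"] == 1
```

Note the map is a function of the current state only, so the agent can re-evaluate it at every step without carrying any internal state.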
Proof: By assumption, β is defined in the initial state. The environment must then map to a solvable state under β in the initial state. Since p is, by assumption, a solution for the problem in E, p∘β must solve the problem in E' unless β changes value before p can solve the problem. Suppose it does. Then the environment must go from a state s, in which some state component of E' is the leftmost pregoal material, to a state s', in which some other component is the leftmost pregoal material. This can only happen if (a) the leftmost pregoal material in s is changed to be in a goal state in s', or (b) some other component that was not pregoal in s becomes pregoal in s'. Case (b) is impossible, since components never re-enter the pregoal state, and case (a) implies that s' is itself a goal state. Thus p∘β must be a solution.
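The proof's key point, that re-evaluating the binding map at every step yields a stateless policy that keeps making progress, can be checked in a toy simulation. The dynamics below (cooking the bound material moves it directly to its goal state, and the run continues until no pregoal material remains) are hypothetical simplifications, not the paper's formal environment:

```python
# Toy check (hypothetical dynamics): the composed policy rebinds via
# the leftmost-pregoal map at every step and carries no internal state,
# yet drives every material to its goal state from any starting tuple.

def leftmost_binding(state):
    for i, status in enumerate(state):
        if status == "pregoal":
            return i
    return None

def step(state, action):
    kind, i = action
    assert kind == "cook"
    # Simplifying assumption: one cook action takes the material to goal.
    return state[:i] + ("goal",) + state[i + 1:]

def run(state):
    while (b := leftmost_binding(state)) is not None:
        state = step(state, ("cook", b))  # policy: cook the bound material
    return state

assert run(("pregoal", "goal", "pregoal")) == ("goal", "goal", "goal")
```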