In reinforcement-learning practice, some simple, ad hoc strategies have been popular. They are rarely, if ever, the best choice for the models of optimality we have used, but they may be viewed as reasonable, computationally tractable, heuristics. Thrun [124] has surveyed a variety of these techniques.