Recently, this approach has been reexamined using insights from the reinforcement-learning literature, with some success. Dorigo conducted a comparative study of Q-learning and classifier systems [36]. Cliff and Ross [26] start with Wilson's zeroth-level classifier system [135] and add one- and two-bit memory registers. They find that, although their system can learn to use short-term memory registers effectively, the approach is unlikely to scale to more complex environments.
Dorigo and Colombetti applied classifier systems to a moderately complex problem of learning robot behavior from immediate reinforcement [38, 37].