Bibliography

Arkin, 1998: Arkin, R. C. (1998).
Behavior-Based Robotics.
Intelligent Robotics and Autonomous Agents. The MIT Press.
Bellman, 1957: Bellman, R. E. (1957).
Dynamic Programming.
Princeton University Press, Princeton.
Boutilier et al., 1999: Boutilier, C., Dean, T., and Hanks, S. (1999).
Decision-theoretic planning: Structural assumptions and computational leverage.
Journal of Artificial Intelligence Research, 11:1-94.
Brooks, 1991: Brooks, R. A. (1991).
Intelligence without representation.
Artificial Intelligence, 47:139-159.
Butz, 1999: Butz, M. (1999).
C-XCS: An implementation of the XCS in C.
(http://www.cs.bath.ac.uk/~amb/LCSWEB/computer.htm).
Celaya and Porta, 1996: Celaya, E. and Porta, J. M. (1996).
Control of a six-legged robot walking on abrupt terrain.
In Proceedings of the IEEE International Conference on Robotics and Automation, pages 2731-2736.
Celaya and Porta, 1998: Celaya, E. and Porta, J. M. (1998).
A control structure for the locomotion of a legged robot on difficult terrain.
IEEE Robotics and Automation Magazine, Special Issue on Walking Robots, 5(2):43-51.
Chapman and Kaelbling, 1991: Chapman, D. and Kaelbling, L. P. (1991).
Input generalization in delayed reinforcement learning: An algorithm and performance comparisons.
In Proceedings of the International Joint Conference on Artificial Intelligence, pages 726-731.
Claus and Boutilier, 1998: Claus, C. and Boutilier, C. (1998).
The dynamics of reinforcement learning in cooperative multiagent systems.
In Proceedings of the Fifteenth National Conference on Artificial Intelligence, pages 746-752. American Association for Artificial Intelligence.
Drummond, 2002: Drummond, C. (2002).
Accelerating reinforcement learning by composing solutions of automatically identified subtasks.
Journal of Artificial Intelligence Research, 16:59-104.
Edelman, 1989: Edelman, G. M. (1989).
Neural Darwinism: The Theory of Neuronal Group Selection.
Oxford University Press.
Hinton et al., 1986: Hinton, G., McClelland, J., and Rumelhart, D. (1986).
Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations, chapter Distributed Representations.
MIT Press, Cambridge, MA.
Ilg et al., 1997: Ilg, W., Mühlfriedel, T., and Berns, K. (1997).
Hybrid learning architecture based on neural networks for adaptive control of a walking machine.
In Proceedings of the 1997 IEEE International Conference on Robotics and Automation, pages 2626-2631.
Kaelbling, 1993: Kaelbling, L. P. (1993).
Learning in Embedded Systems.
A Bradford Book. The MIT Press, Cambridge, MA.
Kaelbling et al., 1996: Kaelbling, L. P., Littman, M. L., and Moore, A. W. (1996).
Reinforcement learning: A survey.
Journal of Artificial Intelligence Research, 4:237-285.
Kanerva, 1988: Kanerva, P. (1988).
Sparse Distributed Memory.
MIT Press, Cambridge, MA.
Kirchner, 1998: Kirchner, F. (1998).
Q-learning of complex behaviors on a six-legged walking machine.
Robotics and Autonomous Systems, 25:253-262.
Kodjabachian and Meyer, 1998: Kodjabachian, J. and Meyer, J. A. (1998).
Evolution and development of modular control architectures for 1-d locomotion in six-legged animats.
Connection Science, 10(3-4):211-237.
Maes and Brooks, 1990: Maes, P. and Brooks, R. A. (1990).
Learning to coordinate behaviors.
In Proceedings of the AAAI-90, pages 796-802.
Mahadevan and Connell, 1992: Mahadevan, S. and Connell, J. H. (1992).
Automatic programming of behavior-based robots using reinforcement learning.
Artificial Intelligence, 55:311-363.
McCallum, 1995: McCallum, A. K. (1995).
Reinforcement Learning with Selective Perception and Hidden State.
PhD thesis, Department of Computer Science.
Parker, 2000: Parker, G. B. (2000).
Co-evolving model parameters for anytime learning in evolutionary robotics.
Robotics and Autonomous Systems, 33:13-30.
Pendrith and Ryan, 1996: Pendrith, M. D. and Ryan, M. R. K. (1996).
C-trace: A new algorithm for reinforcement learning of robotic control.
In Proceedings of the 1996 International Workshop on Learning for Autonomous Robots (Robotlearn-96).
Poggio and Girosi, 1990: Poggio, T. and Girosi, F. (1990).
Regularization algorithms for learning that are equivalent to multilayer networks.
Science, 247:978-982.
Schmidhuber, 2002: Schmidhuber, J. (2002).
The speed prior: A new simplicity measure yielding near-optimal computable predictions.
In Proceedings of the 15th Annual Conference on Computational Learning Theory (COLT 2002), Lecture Notes in Artificial Intelligence, pages 216-228. Springer.
Sen, 1994: Sen, S. (1994).
Learning to coordinate without sharing information.
In Proceedings of the Twelfth National Conference on Artificial Intelligence, pages 426-431. American Association for Artificial Intelligence.
Sutton, 1996: Sutton, R. (1996).
Generalization in reinforcement learning: Successful examples using sparse coarse coding.
In Proceedings of the 1995 Conference on Advances in Neural Information Processing Systems, pages 1038-1044.
Sutton et al., 1999: Sutton, R., Precup, D., and Singh, S. (1999).
Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning.
Artificial Intelligence, 112:181-211.
Sutton, 1991: Sutton, R. S. (1991).
Reinforcement learning architectures for animats.
In Meyer, J. A. and Wilson, S. W., editors, Proceedings of the First International Conference on Simulation of Adaptive Behavior. From Animals to Animats, pages 288-296. The MIT Press, Bradford Books.
Sutton and Barto, 1998: Sutton, R. S. and Barto, A. G. (1998).
Reinforcement Learning: An Introduction.
A Bradford Book. The MIT Press.
Sutton and Whitehead, 1993: Sutton, R. S. and Whitehead, S. D. (1993).
Online learning with random representations.
In Proceedings of the Tenth International Conference on Machine Learning, pages 314-321. Morgan Kaufmann, San Francisco, CA.
Tan, 1997: Tan, M. (1997).
Multi-agent reinforcement learning: Independent vs. cooperative agents.
In Readings in Agents, pages 487-494. Morgan Kaufmann Publishers Inc.
Vallejo and Ramos, 2000: Vallejo, E. E. and Ramos, F. (2000).
A distributed genetic programming architecture for the evolution of robust insect locomotion controllers.
In Meyer, J. A., Berthoz, A., Floreano, D., Roitblat, H. L., and Wilson, S. W., editors, Supplement Proceedings of the Sixth International Conference on Simulation of Adaptive Behavior: From Animals to Animats, pages 235-244. The International Society for Adaptive Behavior.
Venturini, 1994: Venturini, G. (1994).
Apprentissage Adaptatif et Apprentissage Supervisé par Algorithme Génétique.
PhD thesis.
Watkins and Dayan, 1992: Watkins, C. J. C. H. and Dayan, P. (1992).
Q-learning.
Machine Learning, 8:279-292.
Widrow and Hoff, 1960: Widrow, B. and Hoff, M. (1960).
Adaptive switching circuits.
In Western Electronic Show and Convention, Volume 4, pages 96-104. Institute of Radio Engineers (now IEEE).
Wilson, 1995: Wilson, S. W. (1995).
Classifier fitness based on accuracy.
Evolutionary Computation, 3:149-175.
Wilson, 1996: Wilson, S. W. (1996).
Explore/exploit strategies in autonomy.
In From Animals to Animats 4: Proceedings of the 4th International Conference on Simulation of Adaptive Behavior, pages 325-332.

Josep M Porta 2005-02-17