References

Chin & Dyer (1986): Chin, C. H. & Dyer, C. R. (1986). Model-based recognition in robot vision. ACM Computing Surveys, 18(1), 67-108.
Christiansen (1992): Christiansen, A. D. (1992). Learning to predict in uncertain continuous tasks. In Proceedings of the Ninth International Workshop on Machine Learning, 72-81.
Cohen & Cohen (1993): Cohen, L. D. & Cohen, I. (1993). Finite element methods for active contour models and balloons for 2-d and 3-d images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(11), 1131-1147.
Dijkstra (1959): Dijkstra, E. W. (1959). A note on two problems in connexion with graphs. Numerische Mathematik, 1, 269-271.
Drummond (1996): Drummond, C. (1996). Preventing overshoot of splines with application to reinforcement learning. Computer science technical report TR-96-05, School of Information Technology and Engineering, University of Ottawa, Ottawa, Ontario, Canada.
Drummond (1997): Drummond, C. (1997). Using a case-base of surfaces to speed-up reinforcement learning. In Proceedings of the Second International Conference on Case-Based Reasoning, 1266 of LNAI, 435-444.
Drummond (1998): Drummond, C. (1998). Composing functions to speed up reinforcement learning in a changing world. In Proceedings of the Tenth European Conference on Machine Learning, 1398 of LNAI, 370-381.
Drummond (1999): Drummond, C. (1999). A Symbol's Role in Learning Low Level Control Functions. Ph.D. thesis, School of Information Technology and Engineering, University of Ottawa, Ottawa, Ontario, Canada.
Galil (1986): Galil, Z. (1986). Efficient algorithms for finding maximum matching in graphs. ACM Computing Surveys, 18(1), 23-38.
Gold & Rangarajan (1996): Gold, S. & Rangarajan, A. (1996). A graduated assignment algorithm for graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(4), 377-388.
Gordon (1995): Gordon, G. J. (1995). Stable function approximation in dynamic programming. In Proceedings of the Twelfth International Conference on Machine Learning, 261-268.
Gordon & Segre (1996): Gordon, G. J. & Segre, A. M. (1996). Nonparametric statistical methods for experimental evaluations of speedup learning. In Proceedings of the Thirteenth International Conference on Machine Learning, 200-206.
Hammond (1990): Hammond, K. J. (1990). Case-based planning: A framework for planning from experience. The Journal of Cognitive Science, 14(3), 385-443.
Hauskrecht, Meuleau, Boutilier, Kaelbling, & Dean (1998): Hauskrecht, M., Meuleau, N., Boutilier, C., Kaelbling, L. P., & Dean, T. (1998). Hierarchical solution for Markov decision processes using macro-actions. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, 220-229.
Kass, Witkin, & Terzopoulos (1987): Kass, M., Witkin, A., & Terzopoulos, D. (1987). Snakes: Active contour models. International Journal of Computer Vision, 1, 321-331.
Leroy, Herlin, & Cohen (1996): Leroy, B., Herlin, I. L., & Cohen, L. D. (1996). Multi-resolution algorithms for active contour models. In Proceedings of the Twelfth International Conference on Analysis and Optimization of Systems, 58-65.
Leymarie & Levine (1993): Leymarie, F. & Levine, M. D. (1993). Tracking deformable objects in the plane using an active contour model. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(6), 617-634.
MacDonald (1992): MacDonald, A. (1992). Graphs: Notes on symmetries, imbeddings, decompositions. Electrical Engineering Department TR-92-10-AJM, Brunel University, Uxbridge, Middlesex, United Kingdom.
Mahadevan & Connell (1992): Mahadevan, S. & Connell, J. (1992). Automatic programming of behavior-based robots using reinforcement learning. Artificial Intelligence, 55, 311-365.
Mallat & Zhong (1992): Mallat, S. & Zhong, S. (1992). Characterization of signals from multiscale edges. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(7), 710-732.
Marr (1982): Marr, D. (1982). Vision: a Computational Investigation into the Human Representation and Processing of Visual Information. W.H. Freeman.
McCallum (1995a): McCallum, R. A. (1995a). Instance-based state identification for reinforcement learning. In Advances in Neural Information Processing Systems 7, 377-384.
McCallum (1995b): McCallum, R. A. (1995b). Instance-based utile distinctions for reinforcement learning with hidden state. In Proceedings of the Twelfth International Conference on Machine Learning, 387-395.
Moore & Atkeson (1993): Moore, A. W. & Atkeson, C. G. (1993). Prioritized sweeping: Reinforcement learning with less data and less real time. Machine Learning, 13, 103-130.
Moore (1992): Moore, A. W. (1992). Variable resolution dynamic programming: Efficiently learning action maps in multivariate real-valued state spaces. In Proceedings of the Ninth International Workshop on Machine Learning.
Nason (1995): Nason, G. (1995). Three-dimensional projection pursuit. Department of Mathematics, University of Bristol, Bristol, United Kingdom.
Osborne & Bridge (1997): Osborne, H. & Bridge, D. (1997). Similarity metrics: A formal unification of cardinal and non-cardinal similarity measures. In Proceedings of the Second International Conference on Case-Based Reasoning, 1266 of LNAI, 235-244.
Parr (1998): Parr, R. (1998). Flexible decomposition algorithms for weakly coupled Markov decision problems. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, 422-430.
Peng (1995): Peng, J. (1995). Efficient memory-based dynamic programming. In Proceedings of the Twelfth International Conference on Machine Learning, 438-439.
Precup, Sutton, & Singh (1997): Precup, D., Sutton, R. S., & Singh, S. P. (1997). Planning with closed-loop macro actions. In Working notes of the 1997 AAAI Fall Symposium on Model-directed Autonomous Systems, 70-76.
Precup, Sutton, & Singh (1998): Precup, D., Sutton, R. S., & Singh, S. P. (1998). Theoretical results on reinforcement learning with temporally abstract options. In Proceedings of the Tenth European Conference on Machine Learning, 1398 of LNAI, 382-393.
Schnabel (1997): Schnabel, J. A. (1997). Multi-Scale Active Shape Description in Medical Imaging. Ph.D. thesis, University of London, London, United Kingdom.
Sheppard & Salzberg (1997): Sheppard, J. W. & Salzberg, S. L. (1997). A teaching strategy for memory-based control. Artificial Intelligence Review: Special Issue on Lazy Learning, 11, 343-370.
Singh & Sutton (1996): Singh, S. P. & Sutton, R. S. (1996). Reinforcement learning with replacing eligibility traces. Machine Learning, 22, 123-158.
Singh (1992): Singh, S. P. (1992). Reinforcement learning with a hierarchy of abstract models. In Proceedings of the Tenth National Conference on Artificial Intelligence, 202-207.
Suetens, Fua, & Hanson (1992): Suetens, P., Fua, P., & Hanson, A. (1992). Computational strategies for object recognition. ACM Computing Surveys, 24(1), 5-61.
Sutton (1990): Sutton, R. S. (1990). Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In Proceedings of the Seventh International Conference on Machine Learning, 216-224.
Sutton (1996): Sutton, R. S. (1996). Generalization in reinforcement learning: Successful examples using sparse coarse coding. In Advances in Neural Information Processing Systems 8, 1038-1044.
Sutton & Barto (1998): Sutton, R. S. & Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press.
Tadepalli & Ok (1996): Tadepalli, P. & Ok, D. (1996). Scaling up average reward reinforcement learning by approximating the domain models and the value function. In Proceedings of the Thirteenth International Conference on Machine Learning, 471-479.
Tanimoto (1990): Tanimoto, S. L. (1990). The Elements of Artificial Intelligence. W.H. Freeman.
Terzopoulos (1986): Terzopoulos, D. (1986). Regularization of inverse visual problems involving discontinuities. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(4), 413-423.
Thrun & Schwartz (1994): Thrun, S. & Schwartz, A. (1994). Finding structure in reinforcement learning. In Advances in Neural Information Processing Systems 7, 385-392.
Veloso & Carbonell (1993): Veloso, M. M. & Carbonell, J. G. (1993). Derivational analogy in prodigy: Automating case acquisition, storage and utilization. Machine Learning, 10(3), 249-278.
Watkins & Dayan (1992): Watkins, C. J. & Dayan, P. (1992). Technical note: Q-learning. Machine Learning, 8(3-4), 279-292.

Chris Drummond
Thursday January 31 01:30:31 EST 2002