
References

Chin & Dyer (1986)
Chin, C. H. & Dyer, C. R. (1986). Model-based recognition in robot vision. ACM Computing Surveys, 18(1), 67-108.

Christiansen (1992)
Christiansen, A. D. (1992). Learning to predict in uncertain continuous tasks. In Proceedings of the Ninth International Workshop on Machine Learning, 72-81.

Cohen & Cohen (1993)
Cohen, L. D. & Cohen, I. (1993). Finite element methods for active contour models and balloons for 2-d and 3-d images. IEEE Transactions On Pattern Analysis And Machine Intelligence, 15(11), 1131-1147.

Dijkstra (1959)
Dijkstra, E. W. (1959). A note on two problems in connexion with graphs. Numerische Mathematik, 1, 269-271.

Drummond (1996)
Drummond, C. (1996). Preventing overshoot of splines with application to reinforcement learning. Computer science technical report TR-96-05, School of Information Technology and Engineering, University of Ottawa, Ottawa, Ontario, Canada.

Drummond (1997)
Drummond, C. (1997). Using a case-base of surfaces to speed-up reinforcement learning. In Proceedings of the Second International Conference on Case-Based Reasoning, 1266 of LNAI, 435-444.

Drummond (1998)
Drummond, C. (1998). Composing functions to speed up reinforcement learning in a changing world. In Proceedings of the Tenth European Conference on Machine Learning, 1398 of LNAI, 370-381.

Drummond (1999)
Drummond, C. (1999). A Symbol's Role in Learning Low Level Control Functions. Ph.D. thesis, School of Information Technology and Engineering, University of Ottawa, Ottawa, Ontario, Canada.

Galil (1986)
Galil, Z. (1986). Efficient algorithms for finding maximum matching in graphs. ACM Computing Surveys, 18(1), 23-38.

Gold & Rangarajan (1996)
Gold, S. & Rangarajan, A. (1996). A graduated assignment algorithm for graph matching. IEEE Transactions On Pattern Analysis And Machine Intelligence, 18(4), 377-388.

Gordon (1995)
Gordon, G. J. (1995). Stable function approximation in dynamic programming. In Proceedings of the Twelfth International Conference on Machine Learning, 261-268.

Gordon & Segre (1996)
Gordon, G. J. & Segre, A. M. (1996). Nonparametric statistical methods for experimental evaluations of speedup learning. In Proceedings of the Thirteenth International Conference on Machine Learning, 200-206.

Hammond (1990)
Hammond, K. J. (1990). Case-based planning: A framework for planning from experience. The Journal of Cognitive Science, 14(3), 385-443.

Hauskrecht, Meuleau, Boutilier, Kaelbling, & Dean (1998)
Hauskrecht, M., Meuleau, N., Boutilier, C., Kaelbling, L. P., & Dean, T. (1998). Hierarchical solution for Markov decision processes using macro-actions. In Proceedings of the Fourteenth Conference on Uncertainty In Artificial Intelligence, 220-229.

Kass, Witkin, & Terzopoulos (1987)
Kass, M., Witkin, A., & Terzopoulos, D. (1987). Snakes: Active contour models. International Journal of Computer Vision, 1, 321-331.

Leroy, Herlin, & Cohen (1996)
Leroy, B., Herlin, I. L., & Cohen, L. D. (1996). Multi-resolution algorithms for active contour models. In Proceedings of the Twelfth International Conference on Analysis and Optimization of Systems, 58-65.

Leymarie & Levine (1993)
Leymarie, F. & Levine, M. D. (1993). Tracking deformable objects in the plane using an active contour model. IEEE Transactions On Pattern Analysis And Machine Intelligence, 15(6), 617-634.

MacDonald (1992)
MacDonald, A. (1992). Graphs: Notes on symmetries, imbeddings, decompositions. Electrical Engineering Department TR-92-10-AJM, Brunel University, Uxbridge, Middlesex, United Kingdom.

Mahadevan & Connell (1992)
Mahadevan, S. & Connell, J. (1992). Automatic programming of behavior-based robots using reinforcement learning. Artificial Intelligence, 55, 311-365.

Mallat & Zhong (1992)
Mallat, S. & Zhong, S. (1992). Characterization of signals from multiscale edges. IEEE Transactions On Pattern Analysis And Machine Intelligence, 14(7), 710-732.

Marr (1982)
Marr, D. (1982). Vision: a Computational Investigation into the Human Representation and Processing of Visual Information. W.H. Freeman.

McCallum (1995a)
McCallum, R. A. (1995a). Instance-based state identification for reinforcement learning. In Advances in Neural Information Processing Systems 7, 377-384.

McCallum (1995b)
McCallum, R. A. (1995b). Instance-based utile distinctions for reinforcement learning with hidden state. In Proceedings of the Twelfth International Conference on Machine Learning, 387-395.

Moore & Atkeson (1993)
Moore, A. W. & Atkeson, C. G. (1993). Prioritized sweeping: Reinforcement learning with less data and less real time. Machine Learning, 13, 103-130.

Moore (1992)
Moore, A. W. (1992). Variable resolution dynamic programming: Efficiently learning action maps in multivariate real-valued state spaces. In Proceedings of the Ninth International Workshop on Machine Learning.

Nason (1995)
Nason, G. (1995). Three-dimensional projection pursuit. Department of Mathematics, University of Bristol, Bristol, United Kingdom.

Osborne & Bridge (1997)
Osborne, H. & Bridge, D. (1997). Similarity metrics: A formal unification of cardinal and non-cardinal similarity measures. In Proceedings of the Second International Conference on Case-Based Reasoning, 1266 of LNAI, 235-244.

Parr (1998)
Parr, R. (1998). Flexible decomposition algorithms for weakly coupled Markov decision problems. In Proceedings of the Fourteenth Conference on Uncertainty In Artificial Intelligence, 422-430.

Peng (1995)
Peng, J. (1995). Efficient memory-based dynamic programming. In Proceedings of the Twelfth International Conference on Machine Learning, 438-439.

Precup, Sutton, & Singh (1997)
Precup, D., Sutton, R. S., & Singh, S. P. (1997). Planning with closed-loop macro actions. In Working notes of the 1997 AAAI Fall Symposium on Model-directed Autonomous Systems, 70-76.

Precup, Sutton, & Singh (1998)
Precup, D., Sutton, R. S., & Singh, S. P. (1998). Theoretical results on reinforcement learning with temporally abstract options. In Proceedings of the Tenth European Conference on Machine Learning, 1398 of LNAI, 382-393.

Schnabel (1997)
Schnabel, J. A. (1997). Multi-Scale Active Shape Description in Medical Imaging. Ph.D. thesis, University of London, London, United Kingdom.

Sheppard & Salzberg (1997)
Sheppard, J. W. & Salzberg, S. L. (1997). A teaching strategy for memory-based control. Artificial Intelligence Review: Special Issue on Lazy Learning, 11, 343-370.

Singh & Sutton (1996)
Singh, S. P. & Sutton, R. S. (1996). Reinforcement learning with replacing eligibility traces. Machine Learning, 22, 123-158.

Singh (1992)
Singh, S. P. (1992). Reinforcement learning with a hierarchy of abstract models. In Proceedings of the Tenth National Conference on Artificial Intelligence, 202-207.

Suetens, Fua, & Hanson (1992)
Suetens, P., Fua, P., & Hanson, A. (1992). Computational strategies for object recognition. ACM Computing Surveys, 24(1), 5-61.

Sutton (1990)
Sutton, R. S. (1990). Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In Proceedings of the Seventh International Conference on Machine Learning, 216-224.

Sutton (1996)
Sutton, R. S. (1996). Generalization in reinforcement learning: Successful examples using sparse coarse coding. In Advances in Neural Information Processing Systems 8, 1038-1044.

Sutton & Barto (1998)
Sutton, R. S. & Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press.

Tadepalli & Ok (1996)
Tadepalli, P. & Ok, D. (1996). Scaling up average reward reinforcement learning by approximating the domain models and the value function. In Proceedings of the Thirteenth International Conference on Machine Learning, 471-479.

Tanimoto (1990)
Tanimoto, S. L. (1990). The Elements of Artificial Intelligence. W.H. Freeman.

Terzopoulos (1986)
Terzopoulos, D. (1986). Regularization of inverse visual problems involving discontinuities. IEEE Transactions On Pattern Analysis And Machine Intelligence, 8(4), 413-423.

Thrun & Schwartz (1994)
Thrun, S. & Schwartz, A. (1994). Finding structure in reinforcement learning. In Advances in Neural Information Processing Systems 7, 385-392.

Veloso & Carbonell (1993)
Veloso, M. M. & Carbonell, J. G. (1993). Derivational analogy in PRODIGY: Automating case acquisition, storage and utilization. Machine Learning, 10(3), 249-278.

Watkins & Dayan (1992)
Watkins, C. J. & Dayan, P. (1992). Technical note: Q-learning. Machine Learning, 8(3-4), 279-292.



Chris Drummond
Thursday January 31 01:30:31 EST 2002