16-745: Optimal Control and Reinforcement Learning: Course Description


This course surveys the use of optimization to design behavior. We will explore ways to represent policies, including hand-designed parametric functions, basis functions, tables, and trajectory libraries. We will also explore algorithms to create policies, including parameter optimization and trajectory optimization (first- and second-order gradient methods, sequential quadratic programming, random search methods, evolutionary algorithms, etc.). We will discuss how to handle the discrepancy between the models used to create policies and the actual system being controlled (evaluation and robustness issues). The course will combine lectures, student-presented material, and projects. The goal is to help participants find the most effective methods for their problems.
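To give a flavor of the simplest method in that list, here is a minimal sketch of first-order gradient descent applied to trajectory optimization. The double-integrator model, cost weights, and step sizes are illustrative choices for this sketch, not course material:

```python
def rollout(u, dt=0.1):
    # simulate a 1-D double integrator (position x, velocity v) from rest
    x, v = 0.0, 0.0
    for uk in u:
        x += dt * v
        v += dt * uk
    return x, v

def cost(u, goal=1.0, dt=0.1, w=100.0):
    # control effort plus a terminal penalty on position error and velocity
    x, v = rollout(u, dt)
    effort = sum(uk * uk for uk in u) * dt
    return effort + w * ((x - goal) ** 2 + v * v)

def optimize(n=20, iters=500, step=0.01, eps=1e-6):
    # plain first-order gradient descent; gradients by finite differences
    u = [0.0] * n
    for _ in range(iters):
        base = cost(u)
        grad = []
        for k in range(n):
            u[k] += eps
            grad.append((cost(u) - base) / eps)
            u[k] -= eps
        u = [uk - step * g for uk, g in zip(u, grad)]
    return u
```

With these settings the optimized control sequence drives the mass close to the goal with near-zero final velocity; the second-order methods covered in the course (e.g. SQP) would reach the same solution in far fewer iterations.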


Why teach an optimal control course at the Robotics Institute at CMU?

Progress in computer animation based on dynamic optimization has demonstrated solutions to problems we could not solve in the past. We are getting closer to practical use of optimal control and reinforcement learning for animation and robot planning.

New hardware, such as cost-effective supercomputer clusters and thousand-core GPUs/CPUs, also helps make optimal control and reinforcement learning practical.

CMU has been a leader in applying optimal control to animation and robotics. We honor Andy Witkin (1952-2010) for his contributions in applying trajectory optimization to computer animation. Andy was a professor at CMU before moving to Pixar.
Andrew Witkin and Michael Kass. Spacetime constraints. Computer Graphics, 22:159-168, 1988 (Proc. SIGGRAPH '88).


Topics

We will focus on systems with continuous states and continuous actions. We will focus on deterministic systems and systems with "mild" stochastic effects (most of the time any additive noise distribution is assumed to be unimodal or Gaussian, and there are few discontinuities). We will cover both continuous-time and discrete-time systems.
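To make the continuous-/discrete-time distinction concrete, here is a small sketch that takes a continuous-time model (a damped, torque-driven pendulum, chosen here purely for illustration) and discretizes it with a forward-Euler step:

```python
import math

def pendulum(theta, omega, u, g=9.81, length=1.0, b=0.1):
    # continuous-time dynamics: theta_dot = omega,
    # omega_dot = u - b*omega - (g/l)*sin(theta)
    return omega, u - b * omega - (g / length) * math.sin(theta)

def euler_step(theta, omega, u, dt):
    # forward-Euler discretization: x_{k+1} = x_k + dt * f(x_k, u_k)
    dtheta, domega = pendulum(theta, omega, u)
    return theta + dt * dtheta, omega + dt * domega

def simulate(theta0, omega0, u=0.0, dt=0.001, steps=10000):
    # roll the discrete-time system forward for steps * dt seconds
    theta, omega = theta0, omega0
    for _ in range(steps):
        theta, omega = euler_step(theta, omega, u, dt)
    return theta, omega
```

Forward Euler is the simplest discretization; it slightly injects energy into oscillatory systems, which is one reason step size (or a better integrator) matters when a discrete-time model is used inside an optimizer.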

Optional topics: Based on class interest, we will pick from this list and topics suggested by participants:


Prerequisites

Graduate standing or permission of the instructor. You must be able to write programs in some programming language, or use a package such as Matlab. Prior exposure to numerical methods is helpful.


Textbook

There is no textbook for this course. See the resources listed below.


Work


Relevant Books

There is no single reading with the coverage I want, but here are some relevant readings.

Dynamic Optimization, Arthur E. Bryson, 1999
Matlab code

Practical Methods for Optimal Control and Estimation Using Nonlinear Programming, Second Edition (Advances in Design and Control), John T. Betts, 2009

Dynamic Programming and Optimal Control, Dimitri P. Bertsekas, Vol. I, 3rd edition, 2005; Vol. II, 3rd edition, 2007.


Relevant Courses



Stuff From Economics

Worked Examples in Dynamic Optimization: Analytic and Numeric Methods

Dynamic Optimization in Continuous-Time Economic Models (A Guide for the Perplexed)

Dynamic Optimization: The Calculus of Variations and Optimal Control in Economics and Management (Advanced Textbooks in Economics)

Elements of Dynamic Optimization (also see Amazon.com page)

Econ 610: Stochastic Dynamic Optimization in Economics

Lancaster U. Management School: PhD Course - Foundations of Optimization


Other Resources

Dynamic Programming and Optimal Control/Approximate Dynamic Programming, Dimitri P. Bertsekas

B. van Roy: See Dynamic Optimization publications

DOTcvpSB, a software toolbox for dynamic optimization in systems biology

Optimal control (Wikipedia) - too narrow

Optimal control (Scholarpedia) - also too narrow