CMU RI 16-711: KDC: Assignment 5


This assignment explores designing a controller and a global policy for an unstable system (cart-pole).


Here is a simulation of an inverted pendulum balanced on a cart. Your goal is to develop controllers for this system that get and keep the pole pointed upwards. Use the useful and lib directories from a previous assignment.


Part 1: Try to design a controller by hand for this system. The point of this part of the assignment is to make you appreciate automatic controller design methods. If you think this is too easy, manually design a controller for a jointed pole on a cart, a flexible pole, two unequal length poles on the same cart, or a system with a 0.2 second delay in responding to commands.


Part 2: Design an LQR-based controller for this system. You must choose the optimization criterion. Find the LQR-based controller with the biggest volume of initial conditions in state space for which it works. You can model the volume with a grid, or try to estimate the volume with an ellipsoid. Test what happens when the optimization criterion has a very small penalty on the pole angle vs. a very small penalty on the cart position. Why is there a difference?
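As a starting point for Part 2, here is a minimal sketch of an LQR gain computation, not a reference solution. It assumes a frictionless cart-pole with a point mass at the pole tip, linearized about upright and discretized with an Euler step. The masses, pole length, time step, and the function names linearized_cartpole and dlqr are all illustrative assumptions; the provided simulation may use different parameters and conventions.

```python
import numpy as np

# Assumed (hypothetical) parameters; the assignment does not specify them.
M_CART = 1.0   # cart mass [kg]
M_POLE = 0.1   # point mass at the pole tip [kg]
L_POLE = 0.5   # pole length [m]
G = 9.81       # gravity [m/s^2]
DT = 0.01      # control time step [s]


def linearized_cartpole(dt=DT):
    """Euler-discretized linearization about the upright equilibrium.

    State is [x, x_dot, theta, theta_dot], with theta = 0 upright.
    The input u is the horizontal force on the cart.
    """
    A = np.array([
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, -M_POLE * G / M_CART, 0.0],
        [0.0, 0.0, 0.0, 1.0],
        [0.0, 0.0, (M_CART + M_POLE) * G / (M_CART * L_POLE), 0.0],
    ])
    B = np.array([[0.0], [1.0 / M_CART], [0.0], [-1.0 / (M_CART * L_POLE)]])
    return np.eye(4) + dt * A, dt * B


def dlqr(Ad, Bd, Q, R, tol=1e-10, max_iter=200000):
    """Iterate the discrete Riccati recursion until it stops changing.

    Returns the gain K (control law u = -K x) and the cost-to-go matrix P.
    """
    P = Q.copy()
    for _ in range(max_iter):
        K = np.linalg.solve(R + Bd.T @ P @ Bd, Bd.T @ P @ Ad)
        P_next = Q + Ad.T @ P @ (Ad - Bd @ K)
        converged = np.max(np.abs(P_next - P)) < tol * max(1.0, np.max(np.abs(P)))
        P = P_next
        if converged:
            break
    K = np.linalg.solve(R + Bd.T @ P @ Bd, Bd.T @ P @ Ad)
    return K, P


Ad, Bd = linearized_cartpole()
Q = DT * np.diag([1.0, 1.0, 1.0, 1.0])   # penalties on x, x_dot, theta, theta_dot
R = DT * np.array([[1.0]])               # penalty on the force
K, P = dlqr(Ad, Bd, Q, R)
print("K =", K)
print("closed-loop |eig| =", np.abs(np.linalg.eigvals(Ad - Bd @ K)))
```

Changing individual entries of the diagonal of Q (for example, shrinking the cart-position or pole-angle weight) and rerunning dlqr is one way to run the comparison the question asks about. The gain is only guaranteed to stabilize the linearized model, so the volume of initial conditions still has to be measured on the nonlinear simulation.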


Part 3: Design a global policy that can swing the pendulum up to the top using dynamic programming. The criterion you should use is a pure quadratic criterion x'*Q*x + u'*R*u where x is the state vector, u is the action vector, and Q and R are identity matrices. What is the volume of initial conditions in state space for which the LQR controller for the same criterion has a value function that is less than twice the value function of the optimal policy?
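To illustrate the mechanics of grid-based dynamic programming, here is a rough sketch of value iteration on a deliberately reduced stand-in problem: a torque-limited pendulum with a 2-D state (theta, omega) instead of the 4-D cart-pole. Everything specific in it is an assumption made for illustration: the pendulum model, the grid sizes, the action set, the velocity range, and the fixed number of sweeps (it does not test for convergence). The cost is the assignment's quadratic criterion with identity Q and R, scaled by the time step.

```python
import numpy as np

# Hypothetical stand-in problem: pendulum only, theta = 0 is upright.
G_OVER_L = 9.81                       # g / l for an assumed 1 m pole [1/s^2]
DT = 0.05                             # time step [s]
N_TH, N_W = 41, 41                    # grid sizes (odd, so 0 is a grid node)
W_MAX = 8.0                           # angular velocity range [rad/s]
ACTIONS = np.linspace(-5.0, 5.0, 11)  # assumed angular-acceleration commands

theta = np.linspace(-np.pi, np.pi, N_TH)
omega = np.linspace(-W_MAX, W_MAX, N_W)
TH, W = np.meshgrid(theta, omega, indexing="ij")


def interp(V, th, w):
    """Bilinear interpolation of V; theta wraps around, omega is clamped."""
    th = (th + np.pi) % (2 * np.pi) - np.pi
    w = np.clip(w, -W_MAX, W_MAX)
    fi = (th + np.pi) / (2 * np.pi) * (N_TH - 1)
    fj = (w + W_MAX) / (2 * W_MAX) * (N_W - 1)
    i0 = np.clip(np.floor(fi).astype(int), 0, N_TH - 2)
    j0 = np.clip(np.floor(fj).astype(int), 0, N_W - 2)
    a, b = fi - i0, fj - j0
    return ((1 - a) * (1 - b) * V[i0, j0] + a * (1 - b) * V[i0 + 1, j0]
            + (1 - a) * b * V[i0, j0 + 1] + a * b * V[i0 + 1, j0 + 1])


def value_iteration(n_sweeps=200):
    """Run a fixed number of Bellman backups over the whole grid."""
    V = np.zeros((N_TH, N_W))
    for _ in range(n_sweeps):
        candidates = []
        for u in ACTIONS:
            w_next = W + DT * (G_OVER_L * np.sin(TH) + u)
            th_next = TH + DT * w_next            # semi-implicit Euler step
            cost = DT * (TH**2 + W**2 + u**2)     # x'Qx + u'Ru with Q = R = I
            candidates.append(cost + interp(V, th_next, w_next))
        V = np.min(candidates, axis=0)            # best action at each grid node
    return V


V = value_iteration()
print("value at upright:", V[N_TH // 2, N_W // 2])
print("value hanging down:", V[0, N_W // 2])
```

For the cart-pole itself the grid needs four dimensions, so memory and sweep time grow quickly with resolution. Clamping omega at the grid edge underestimates the cost of fast states, which is one reason to check the resulting policy in simulation. Once a value function is available on a grid, one way to approach the volume question is to evaluate the LQR cost x'Px at the same nodes, count the nodes where it is less than twice the DP value, and multiply the count by the cell volume.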


Part 4: Just for fun: Let's measure how humans do this task, and try to identify what the human controller is. Software to be provided.


Things to think about: Does a longer or shorter pole make this easier or harder? Does viscous friction help? Does Coulomb friction help? How do you handle static friction?

Another thing to think about: How do you get LQR control design to generate an integral gain?
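One common approach, sketched here under assumptions rather than offered as the intended answer, is to augment the state with the running integral of an error signal and penalize that extra state in Q. The LQR gain on the augmented state then includes an integral term. The example below integrates the cart position; the model parameters, time step, and the helper name riccati_gain are hypothetical.

```python
import numpy as np

DT = 0.01  # assumed control time step [s]
# Assumed parameters: cart mass, pole tip mass, pole length, gravity.
M_CART, M_POLE, L_POLE, G = 1.0, 0.1, 0.5, 9.81

# Linearized cart-pole about upright, state [x, x_dot, theta, theta_dot].
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, -M_POLE * G / M_CART, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, (M_CART + M_POLE) * G / (M_CART * L_POLE), 0.0]])
B = np.array([[0.0], [1.0 / M_CART], [0.0], [-1.0 / (M_CART * L_POLE)]])
Ad, Bd = np.eye(4) + DT * A, DT * B

# Extra state z with z[k+1] = z[k] + DT * x[k]: the integral of cart position.
C = np.array([[1.0, 0.0, 0.0, 0.0]])
A_aug = np.block([[Ad, np.zeros((4, 1))],
                  [DT * C, np.ones((1, 1))]])
B_aug = np.vstack([Bd, [[0.0]]])

Q_aug = DT * np.diag([1.0, 1.0, 1.0, 1.0, 1.0])  # last entry penalizes z
R = DT * np.array([[1.0]])


def riccati_gain(A_d, B_d, Q, R, tol=1e-10, max_iter=200000):
    """Discrete Riccati recursion iterated to a fixed point; returns K."""
    P = Q.copy()
    for _ in range(max_iter):
        K = np.linalg.solve(R + B_d.T @ P @ B_d, B_d.T @ P @ A_d)
        P_next = Q + A_d.T @ P @ (A_d - B_d @ K)
        converged = np.max(np.abs(P_next - P)) < tol * max(1.0, np.max(np.abs(P)))
        P = P_next
        if converged:
            break
    return np.linalg.solve(R + B_d.T @ P @ B_d, B_d.T @ P @ A_d)


K_aug = riccati_gain(A_aug, B_aug, Q_aug, R)
print("state-feedback gains:", K_aug[0, :4])
print("integral gain:", K_aug[0, 4])
```

The controller then has to carry z as internal state, updating it each step and applying u = -K_aug times the stacked vector [x, x_dot, theta, theta_dot, z]. The weight on z in Q_aug trades off how aggressively a steady position offset is removed against how much the transient response is disturbed.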


References

http://www.coneural.org/florian/papers/05_cart_pole.pdf
http://mil.engr.utk.edu/wiki/Cart-Pole_Dynamics_Testbed_and_Analysis
http://mil.engr.utk.edu/wiki/Example_Cart-Pole_Controllers
http://65.44.200.132/Library/1996/Correction_Cart-Pole.pdf
http://www.cs.ualberta.ca/~sutton/book/code/pole.c
http://brain.cc.kogakuin.ac.jp/~kanamaru/NN/CPRL/
http://portal.acm.org/citation.cfm?id=869873
http://www.cmap.polytechnique.fr/~munos/variable/cartpole.html
http://www-clmc.usc.edu/Resources/Publications?id=2654
http://www-clmc.usc.edu/publications/S/schaal-NIPS1997.pdf
http://www.stanford.edu/class/cs229/ps/ps4/q6/control.m
http://www-anw.cs.umass.edu/rlr/domains.html
http://mlg.eng.cam.ac.uk/marc/learn_ctrl.php
http://www.serbi.ula.ve/sira/Congresos%20Internacionales/con1994_5.pdf
http://www.ict.swin.edu.au/personal/jbrownlee/2005/TR07-2005.pdf
http://www.ijee.dit.ie/articles/Vol18-6/IJEE1333.pdf
http://inst.eecs.berkeley.edu/~ee128/fa08/labs/EECS128_lab5a.pdf
http://www.mitpressjournals.org/doi/abs/10.1162/0899766053011528
http://www.cs.cmu.edu/~sandholm/cs15-381/hw4/index.html


What to turn in?

Generate a web page describing what you did. Include links to your source and compiled code in either .zip, .tar, or .tar.gz format. Be sure to list the names of all the members of your group. Mail the URL of your web page to sanan@cmu.xxx and cc cga@cmu.xxx [You complete the address, we are trying to avoid spam.] The writeup is more important than the code. What did you do? Why did it work? What didn't work?


You can use any type of computer/OS/language you want. You can work in groups or alone.


Questions