Using Function Optimization To Find Policies: Ball Run
This assignment explores using function optimization to do
policy optimization.
The problem used is "ball run", also known as "marble run" or "bubble ball".
Part 1
Please install the free bubble ball app on your phone and start
playing it.
Links: Wikipedia, Apple, Android
Part 2
Get or make a simulation that will support balls falling in air, bouncing,
and rolling/sliding down a surface.
I have written a simulation using Open Dynamics Engine.
ODE simulation
An alternative simulator is Box2D.
Box2D information
Part 3
Make an optimizer that learns to move the ball as far to the right as possible
with one obstacle bar 0.1m long. Implement several optimization approaches,
including a gradient-based approach, CMA-ES, and another non-derivative-based
approach. DON'T write the code yourself! Use a library, such as
Matlab or the GNU Scientific Library. I include an old version of the
Numerical Recipes library (2nd Edition) in the ODE simulation above.
You can get
CMA-ES code from Hansen's CMA-ES web site.
Google "public domain optimization libraries" for more.
Part 4
A vision system sees obstacles and a goal; their positions are listed in obstacles.txt in pixel coordinates (so positive Y is down; you should fix that).
The ball is dropped at (431, 181) in pixel coordinates.
The obstacles are really 0.25m long and 0.037m high, so you also need
to convert the pixel values to meters. Use optimization to find a
simulated setup that is "similar" to the observed one and that gets the ball in the goal.
A video (slow motion) of the actual ball on this run.
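The coordinate fix for part 4 can be sketched as below. This is only an illustration under assumptions: the scale is estimated from the two end points of one obstacle of known length, the origin pixel is a free choice, and the struct and function names are hypothetical. How obstacles.txt actually lays out its values is not assumed here.

```cpp
#include <cmath>

struct Point { double x, y; };

// Scale estimated from one obstacle whose physical length is known
// (0.25 m in this assignment) and whose end points are given in pixels.
double meters_per_pixel(Point end_a_px, Point end_b_px, double length_m) {
    return length_m /
           std::hypot(end_b_px.x - end_a_px.x, end_b_px.y - end_a_px.y);
}

// Convert pixel coordinates (positive Y down) to meters (positive Y up),
// measured from a chosen origin pixel.
Point pixel_to_meters(Point px, Point origin_px, double scale) {
    return { (px.x - origin_px.x) * scale,
             (origin_px.y - px.y) * scale };  // subtracting flips Y
}
```

For example, with a made-up scale of 0.0025 m per pixel and a made-up origin at pixel (0, 480), the drop point (431, 181) maps to about (1.0775, 0.7475) in meters.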
Part 5
For the system in part 4, an observed trial has the following ball
trajectory: trajectory.txt, sampled at
30 frames per second in pixel coordinates.
Use optimization to adjust the parameters of the simulation so that
the simulated trajectory matches the observed trajectory,
with the obstacles back in the observed positions.
Parameters to change might include gravity, air resistance,
something to do with rolling vs. sliding (friction, moment of inertia
of the ball, some parameters you make up, ...), and in the ODE
simulation the bounce parameters in dynamics.cpp:
    else if ( c_type == OBSTACLE )
    {
        contact.surface.mode = dContactBounce | dContactSoftCFM;
        // friction parameter
        contact.surface.mu = 1.0;
        // bounce is the amount of "bounciness".
        contact.surface.bounce = 0.3; // used to be 0.1
        // bounce_vel is the minimum incoming velocity to cause a bounce
        contact.surface.bounce_vel = 0.01;
        // constraint force mixing parameter
        contact.surface.soft_cfm = 0.01; // 0.001 in bounce
    }
Part 6
Construct an interesting simulated ball run with four obstacles,
and send Chris the obstacle locations (in metric coordinates).
We will run it in reality, and send you back a video and the ball trajectory
that actually happened. You then modify the obstacle locations, and
we repeat until the desired behavior happens in reality.
Part 7
We will construct a challenge for you similar to what you have to
do in the bubble ball app.
What to turn in?
You can work in groups or alone.
Generate a web page describing what you did (one per group).
Include links to your source and any compiled code in .zip, .tar, or
.tar.gz format.
Be sure to list the names of all the members of your group.
Mail the URL of your web page to cga@cmu.edu.
The writeup is more important than the code. What did you do? Why
did it work? What didn't work and why?
Questions