15-494/694 Floor-Detecting Occupancy Grid Project
This project will create an occupancy grid representation of the
environment by learning to recognize empty floor space. Any
area that doesn't look like a patch of floor will be treated as
an obstacle. An occupancy grid representation can be used to
drive exploration behavior by tracking which parts of the
environment have not yet been observed.
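To make the representation concrete, here is a minimal sketch of one way
the grid itself could be stored; the cell states, grid dimensions, and the
OccupancyGrid class below are illustrative assumptions, not part of the SDK
or cozmo-tools.

    import numpy as np

    # Hypothetical cell states for the grid.
    UNKNOWN, FREE, OCCUPIED = 0, 1, 2

    class OccupancyGrid:
        """Grid of square cells covering the workspace; units are mm.
        The origin is at the center, so coordinates may be negative."""
        def __init__(self, size_mm=2000, cell_mm=10):
            self.cell_mm = cell_mm
            n = size_mm // cell_mm
            self.cells = np.full((n, n), UNKNOWN, dtype=np.uint8)
            self.offset = n // 2

        def mark(self, x_mm, y_mm, state):
            """Mark the cell containing world point (x_mm, y_mm)."""
            row = int(y_mm // self.cell_mm) + self.offset
            col = int(x_mm // self.cell_mm) + self.offset
            if 0 <= row < self.cells.shape[0] and 0 <= col < self.cells.shape[1]:
                self.cells[row, col] = state

Cells start out UNKNOWN; observations of floor mark them FREE and inferred
obstacles mark them OCCUPIED, which is what lets the grid drive exploration
toward regions that have not yet been observed.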
The Cozmo SDK maintains an occupancy map representation as a
quadtree. The cozmo-tools worldmap viewer will display this map
when the "m" key is pressed. The drawback of this occupancy map
is that it's based on recognizing known objects (cubes and
charger), plus cliff edges from Cozmo's highly unreliable IR
cliff detector. We want to take a different approach:
recognizing empty floor space by learning the appearance of the
floor.
Exploring the built-in occupancy map:
- Run simple_cli and do "show worldmap_viewer".
- Press "m" in the worldmap viewer window to toggle display
of the map. The meanings of the colors are given in
worldmap_viewer.py.
- Type "show particle_viewer" to bring up the particle
viewer window.
- You can use the particle viewer's keyboard commands (such
  as w/a/s/d) to drive the robot around and observe how the
  occupancy map is updated. Type "h" in the particle viewer
  window for a complete list of keyboard commands.
Recognizing the floor:
- The approach we're going to take is to learn patches of
floor, and then compare those patches to regions in the
current camera image to classify pixels as "floor" or
"non-floor".
- Since the "floor" might have texture, such as a
wood-grained table top, we want to use patches rather than
trying to model the floor as a single RGB value.
- Use of color should be very helpful. To enable color camera
  images, do robot.camera.color_image_enabled = True (see the
  sketch after this list).
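As a starting point, the sketch below grabs the latest camera frame as a
numpy array. It assumes the standard Cozmo SDK camera interface
(color_image_enabled, image_stream_enabled, world.latest_image) and that you
are working inside simple_cli, where the robot object is already bound to
robot.

    import numpy as np

    def get_color_image(robot):
        """Return the most recent camera frame as an (H, W, 3) numpy array,
        or None if no frame has arrived yet."""
        robot.camera.image_stream_enabled = True
        robot.camera.color_image_enabled = True   # color is off by default
        img = robot.world.latest_image            # a cozmo CameraImage
        if img is None:
            return None
        return np.array(img.raw_image)            # raw_image is a PIL image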
Training the model: for each camera image, grab a 9x9 patch
near the bottom of the image. Drive the robot around and collect
multiple patches.
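A hedged sketch of the training step, reusing get_color_image from above;
the patch size, margin, and function names are illustrative choices, not
prescribed by the assignment.

    import numpy as np

    PATCH = 9   # patch size in pixels; try other sizes and compare

    def grab_floor_patch(image, margin=20):
        """Cut one PATCH x PATCH training patch from near the bottom center
        of a camera image (an (H, W, 3) numpy array), where the floor is
        most likely to be visible."""
        h, w = image.shape[:2]
        r, c = h - margin, w // 2
        half = PATCH // 2
        return image[r-half:r+half+1, c-half:c+half+1].astype(np.float32)

    # Collect training data while driving around with the particle viewer:
    # floor_patches = []
    # image = get_color_image(robot)
    # if image is not None:
    #     floor_patches.append(grab_floor_patch(image))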
Testing the classifier: for each pixel in the camera image, take
a 9x9 patch around that pixel and compare it to the stored floor
patches using sum-squared-error. If the error of the best
matching patch is low enough, classify that pixel as "floor".
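One way the classification step might look, reusing PATCH and the collected
floor_patches; the error threshold is arbitrary and must be tuned, and the
double loop is written for clarity rather than speed (subsampling pixels or
vectorizing with numpy would make it much faster).

    import numpy as np

    def classify_floor(image, floor_patches, threshold=2000.0):
        """Return a boolean mask (True = floor) for every pixel whose
        surrounding PATCH x PATCH window matches some stored floor patch
        under sum-squared error."""
        h, w = image.shape[:2]
        half = PATCH // 2
        img = image.astype(np.float32)
        mask = np.zeros((h, w), dtype=bool)
        for r in range(half, h - half):
            for c in range(half, w - half):
                window = img[r-half:r+half+1, c-half:c+half+1]
                best = min(np.sum((window - fp) ** 2) for fp in floor_patches)
                mask[r, c] = best < threshold
        return mask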
Experiment to see how well this algorithm works. Is 9x9 the
best patch size to use? How many patches are needed for
accurate floor detection? How does lighting affect the results?
Does the system work better on textured surfaces?
Next steps: we need a way to map from pixel position to a
position in the world. This mapping depends on the robot's head
angle. The cozmo-tools method robot.kine.project_to_ground(cx,cy)
performs this projection.
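A sketch of how the projection might be used to place a floor pixel on the
world map; the exact return format of project_to_ground is an assumption
here (check the cozmo-tools source), and the final transform uses the SDK's
robot.pose as the robot's world pose.

    import math

    def floor_pixel_to_world(robot, cx, cy):
        """Map image pixel (cx, cy) to an (x, y) point on the ground plane
        in world coordinates."""
        p = robot.kine.project_to_ground(cx, cy)
        # Assumption: p can be indexed for x and y in the robot's base frame;
        # if it is a homogeneous vector, divide by p[3] first.
        x_base, y_base = float(p[0]), float(p[1])
        # Rotate/translate into world coordinates using the robot's pose.
        theta = robot.pose.rotation.angle_z.radians
        x = robot.pose.position.x + x_base*math.cos(theta) - y_base*math.sin(theta)
        y = robot.pose.position.y + x_base*math.sin(theta) + y_base*math.cos(theta)
        return x, y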
Where is the obstacle? If a pixel doesn't look like floor,
where is that surface in space? It could be a thin object lying
on the floor, or it could be a larger object sticking up and
occluding our view of more distant spots on the floor. Without
a way to measure depth, we can't tell. Thus, the only place where
an obstacle can safely be inferred is along the bottom edge of a
"non-floor" region, where it borders "floor".