15-494/694 Cognitive Robotics Lab 8: Cozmo and the GPU
I. Software Update, SDK Update, and Initial Setup
Note: You can do this lab/homework assignment either individually
or in teams of two.
At the beginning of every lab you should update your copy of the
cozmo-tools package. Do this:
$ cd ~/cozmo-tools
$ git pull
II. Loading and Saving a Network Trained With the MNIST Dataset
- Make a lab8 directory.
- Download the files mnist3.py and load_mnist3.py into your lab8
directory.
- Skim the mnist3.py source code; it's slightly different from
the version we used previously. With this version, you must call
train() to train the network. At the end of training, it saves
the weights in a file called mnist3-saved.pt.
- Run the model by typing "python3 -i mnist3.py". The "-i"
switch tells Python not to exit after running the program.
After 15 epochs of training it will save the weights.
- If you want to see some of the trained kernels, you can
type display() after training finishes.
- Skim the code for load_mnist3.py. Note that this code has
changed slightly since the lab on Friday; be sure to grab the
latest copy.
- Run load_mnist3.py and observe that it loads the saved weights,
and the reconstituted network classifies the training instance
correctly.
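
The save-then-reload cycle above can be sketched with a common PyTorch pattern: save the model's state_dict, then load it into a freshly constructed network of the same shape. This is only an illustration under stated assumptions; the handout does not show how mnist3.py writes mnist3-saved.pt, and the TinyNet class and the tinynet-saved.pt file name are hypothetical stand-ins.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the lab's network: 28x28 input, 10 classes."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(28 * 28, 10)

    def forward(self, x):
        # Flatten each image to a 784-element vector before the linear layer.
        return self.fc(x.view(x.size(0), -1))


# Pretend this network has just finished training, then save its weights.
trained = TinyNet()
path = os.path.join(tempfile.gettempdir(), "tinynet-saved.pt")
torch.save(trained.state_dict(), path)

# Later (or in another script): rebuild the same architecture and load them.
restored = TinyNet()
# map_location="cpu" lets weights saved on a GPU machine load without one.
restored.load_state_dict(torch.load(path, map_location="cpu"))
restored.eval()  # switch to inference mode before classifying
```

Only the weights are stored in this pattern, so the loading script must define the same network class as the training script.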
III. Capturing Images from Cozmo
- Open cozmo_fsm/program.py in a text editor and search for
"user_image". This function is automatically called by the
StateMachineProgram machinery on every camera image received from
the robot. The first argument is the raw 3 channel RGB image; the
second argument is a single channel grayscale image.
- Copy and run the file Lab8.fsm. This
program captures one image from Cozmo's camera and displays it using
matplotlib.
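
Because user_image is called on every camera image, code that should act only occasionally (say, once per second) has to skip most frames. The following is a minimal sketch of one way to do that. FrameThrottle is a hypothetical helper, not part of cozmo_fsm, and the injectable clock argument is there only to make it easy to test.

```python
import time


class FrameThrottle:
    """Let a frame through at most once per `interval` seconds."""

    def __init__(self, interval=1.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.last = None  # time the last frame was accepted

    def ready(self):
        """Return True if enough time has passed to process a new frame."""
        now = self.clock()
        if self.last is None or now - self.last >= self.interval:
            self.last = now
            return True
        return False
```

A program could create one FrameThrottle and, inside its user_image method, return early whenever ready() is False.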
IV. Homework: Cozmo Digit Recognition Using the GPU
- By combining code from load_mnist3.py and Lab8.fsm you can
write a Cozmo behavior that captures camera images and does digit
recognition.
- You will need to resize the camera image to 28x28 in order to
fit the neural network's input requirements. See this web page for
help on resizing an image using the cv2.resize() function from
OpenCV.
- Since the original image is 320x240, which is not square, you
can't just resize it to 28x28: squeezing a 4:3 frame into a square
would distort the digit's proportions.
- Another issue is that all the data used to train the neural net
was normalized: each digit was scaled to a uniform size and centered
in the image. But if you're holding up a sheet of paper to Cozmo,
the digit will vary in size based on distance, and may not be
centered. Therefore, you will need to write some code to find the
bounding box of the digit, allow for a bit of white space around it,
and rescale the resulting region to 28x28 so it looks like the
training data. You can assume that the input is well-formed, i.e.,
there is a single digit on a white background. But your code must
work on real grayscale images from Cozmo's camera, so when finding
the bounding box it cannot assume that the background pixels are
perfectly white, or that there is no noise in the image.
- Your code should take one camera image per second, normalize it,
run it through the neural network, and display the classification
result on the console.
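
The bounding-box step described above can be sketched as follows. This is one possible approach under stated assumptions, not the required solution: the function name digit_bounding_box and the parameter values (margin, min_count, pad_frac) are illustrative and untuned. It estimates the background brightness from the image itself, so the paper need not be perfectly white, and it ignores rows and columns that contain only a stray dark pixel or two.

```python
import numpy as np


def digit_bounding_box(gray, margin=40, min_count=3, pad_frac=0.2):
    """Return (top, bottom, left, right) of a padded box around the digit.

    gray is a 2-D uint8 array holding a dark digit on a light background.
    Returns None if no digit is found. All defaults are illustrative.
    """
    # Use the median as the background level instead of assuming 255.
    background = np.median(gray)
    ink = gray.astype(np.int32) < background - margin

    # Keep only rows/columns with at least min_count dark pixels, so that
    # isolated noise pixels do not stretch the box.
    rows = np.flatnonzero(ink.sum(axis=1) >= min_count)
    cols = np.flatnonzero(ink.sum(axis=0) >= min_count)
    if rows.size == 0 or cols.size == 0:
        return None

    top, bottom = int(rows[0]), int(rows[-1]) + 1
    left, right = int(cols[0]), int(cols[-1]) + 1

    # Grow the box into a square with some white space on every side.
    size = max(bottom - top, right - left)
    side = size + 2 * int(round(pad_frac * size))
    cy, cx = (top + bottom) // 2, (left + right) // 2

    # Clamp to the image; near an edge the box may end up non-square.
    height, width = gray.shape
    top = max(0, cy - side // 2)
    left = max(0, cx - side // 2)
    bottom = min(height, top + side)
    right = min(width, left + side)
    return top, bottom, left, right
```

Given a box, the crop gray[top:bottom, left:right] is square whenever the box did not have to be clamped, so it can be passed to cv2.resize() with a target size of (28, 28) without distortion. MNIST digits are light strokes on a dark background, so the crop may also need to be inverted, depending on how the network's input was prepared.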
Hand In
Collect your .fsm and .py files into a zip file and hand it in via Autolab.