The Depth Mapping Algorithm
The algorithm takes as input a left YUV sketch, a right YUV sketch, and a boolean mask sketch of the left image. The output is a YUV sketch of the masked objects from the left image with the color of each pixel marking the depth of that pixel. The left and right input images should be taken from different points on the y-axis, both facing in the direction of the positive x-axis (on the Chiara, the y-axis is left-right, and the x-axis is forward-backward). In other words, after taking the left picture, the camera is translated a short distance to the right before taking the right picture. The mask is a boolean sketch marking all pixels from the left image that should be depth mapped. Since we only care about some of the objects in the image, masking out all other pixels significantly speeds up the algorithm and eliminates noise from the output.
For every source pixel in the left image, there is a "matching" pixel in the right image: the pixel that captures the same point on the object as the source pixel does. Since the camera was translated to the right between the two images, the matching pixel in the right image lies in the same row as its source pixel in the left image, but at a different column within that row. In our algorithm we refer to this shift in pixel position as disparity. Objects that are close to the camera have greater disparity than objects farther from the camera, so by measuring disparity we can determine relative distance.
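The relationship between distance and disparity can be illustrated with an idealized pinhole camera. The sketch below is not part of the original algorithm; the function names, the focal length in pixels, and the baseline value are all assumptions chosen for illustration. It projects one point into the left and right views and shows that the shift shrinks as the point moves away.

```python
def project_column(focal_px, lateral, depth):
    """Image column (relative to the image center) of a point seen by an
    idealized pinhole camera. All names here are illustrative assumptions."""
    return focal_px * lateral / depth


def disparity(focal_px, baseline, lateral, depth):
    """Column shift of one point between the left and right views.

    The right camera sits `baseline` units to the right of the left camera,
    so the point appears `baseline` units further left in the right view.
    """
    u_left = project_column(focal_px, lateral, depth)
    u_right = project_column(focal_px, lateral - baseline, depth)
    return u_left - u_right


# Same point, two distances (units are arbitrary but must be consistent).
near = disparity(focal_px=500, baseline=100, lateral=200, depth=1000)
far = disparity(focal_px=500, baseline=100, lateral=200, depth=2000)
print(near, far)
```

Under this model the disparity equals focal_px * baseline / depth, so doubling the distance halves the disparity, and the matching pixel in the right image always lies to the left of the source pixel.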
To calculate disparity, our algorithm iterates over every masked pixel in the left image and finds the best matching pixel in the right image. It does this by placing a window of pixels (11 by 11, for example) around the source pixel in the left image, placing an equal-sized window around each candidate pixel in the same row of the right image, up to and including the source's column, and choosing the candidate whose window has the smallest error compared to the source window. Error is calculated as the sum of squared differences (SSD) between the two windows. The matching pixel will always be to the left of the source pixel, which is why only the pixels in the row up to the source pixel position are considered.
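The search described above can be sketched roughly as follows. This is a minimal illustration, not the project's source code: it assumes single-channel images stored as lists of rows (for example, only the Y channel of a YUV sketch), clamps windows at the image border, and breaks ties in favor of the leftmost candidate. The names `window_ssd`, `disparity_map`, and `half` are hypothetical.

```python
def window_ssd(left, right, row, col_left, col_right, half):
    """Sum of squared differences between two (2*half+1)-square windows.

    Coordinates that fall outside the image are clamped to the nearest
    valid pixel (an assumption; the original border handling is not stated).
    """
    height, width = len(left), len(left[0])
    total = 0
    for dy in range(-half, half + 1):
        y = min(max(row + dy, 0), height - 1)
        for dx in range(-half, half + 1):
            xl = min(max(col_left + dx, 0), width - 1)
            xr = min(max(col_right + dx, 0), width - 1)
            diff = left[y][xl] - right[y][xr]
            total += diff * diff
    return total


def disparity_map(left, right, mask, half=5):
    """Disparity for every masked pixel; unmasked pixels are left at 0.

    half=5 gives the 11 by 11 window used as an example in the text.
    """
    height, width = len(left), len(left[0])
    result = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            if not mask[row][col]:
                continue
            best_col, best_err = col, None
            # Only candidates at or to the left of the source column.
            for cand in range(col + 1):
                err = window_ssd(left, right, row, col, cand, half)
                if best_err is None or err < best_err:
                    best_col, best_err = cand, err
            result[row][col] = col - best_col
    return result
```

The cost grows with the number of masked pixels times the row position times the window area, which is one reason a tight mask makes such a difference to running time.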
Once the disparity value for a source pixel is calculated, that value is entered into the result sketch in the same pixel position. The result sketch interprets this value as a color. Colors range from bright red to dark blue. Near objects have large disparities and appear as bright red or light blue, while distant objects have small disparities and appear dark blue.
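The idea of reading a disparity value as a color can be illustrated with a simple linear blend. The sketch below is only an assumption for illustration: the result sketch's actual color map is not specified here and is likely different, and the function name, the RGB endpoints, and the `max_disparity` parameter are all hypothetical.

```python
def disparity_to_rgb(disparity, max_disparity):
    """Blend linearly from dark blue (far, small disparity) to bright red
    (near, large disparity). Illustrative only; not the sketch's color map."""
    if max_disparity <= 0:
        return (0, 0, 128)
    # Normalize to the range 0..1 and clamp out-of-range values.
    t = min(max(disparity / max_disparity, 0.0), 1.0)
    red = round(255 * t)
    blue = round(128 * (1.0 - t))
    return (red, 0, blue)


print(disparity_to_rgb(0, 40), disparity_to_rgb(40, 40))
```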
The results page has an example of this algorithm in action. The files page has links to download the source code for this algorithm, along with a sample behavior.