Daniel Munoz | Onboard Scene Parsing

Description

Rich scene understanding from 3-D point clouds is a challenging task that requires contextual reasoning, which is typically computationally expensive. The task is further complicated when we expect the scene analysis algorithm to also efficiently handle data that is continuously streamed from a sensor on a mobile robot. Hence, we are typically forced to choose between 1) using a precise representation of the scene at the cost of speed, or 2) making fast but coarse approximations at the cost of more misclassifications. In this work, we demonstrate that we can achieve the best of both worlds by combining a simple, efficient representation of the scene with recent developments in structured prediction, obtaining classifications that are both fast and state-of-the-art. Furthermore, this scene representation naturally handles streaming data and provides a 300% to 500% speedup over more precise representations.

Videos

Datasets

CamVid: [ground truth]
NYUScenes: [images] [ground truth]
MPI-VehicleScenes: [images] [ground truth]

Results from the Miksik et al. ICRA 2013 paper

Per-frame classifications: [CamVid 05VD] [CamVid 16E5] [NYUScenes] [MPI-VehicleScenes]
Temporally smoothed classifications: [CamVid 05VD] [CamVid 16E5] [NYUScenes] [MPI-VehicleScenes]
File formats:

*_seg.txt: An M-by-N matrix with values [0,S-1], which maps each pixel to one of S segments.
*_infer.dat: An S-by-K matrix where each row is the K-class probability distribution for the corresponding segment.
(Unfortunately the *_infer.dat files are not available for the temporally smoothed classifications.)
*_infer.txt: An M-by-N matrix with values [0,K-1], which maps each pixel to the class with highest probability.
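
As an illustration of how these files relate, the following minimal sketch recomputes the per-pixel label matrix (the content of *_infer.txt) from a segment map and per-segment class distributions. It assumes both matrices are stored as whitespace-delimited text, one row per line; the actual encoding of the *_infer.dat files is not specified here and may differ. The helper names parse_matrix and labels_from_segments are hypothetical and not part of any released code.

```python
# Sketch only: assumes whitespace-delimited text matrices, one row per line.
# The real *_infer.dat encoding is not documented on this page.

def parse_matrix(text, cast):
    """Parse whitespace-delimited rows into a list of lists (hypothetical helper)."""
    return [[cast(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]

def labels_from_segments(seg, probs):
    """Map each pixel to the most probable class of its segment.

    seg   : M-by-N matrix of segment indices in [0, S-1]
    probs : S-by-K matrix, row s is the class distribution of segment s
    """
    # Most probable class index for each of the S segments.
    best = [max(range(len(row)), key=row.__getitem__) for row in probs]
    # Look up that class for every pixel via its segment index.
    return [[best[s] for s in row] for row in seg]

# Toy example: 2-by-3 image, S = 3 segments, K = 3 classes.
seg = parse_matrix("0 0 1\n2 1 1\n", int)
probs = parse_matrix("0.7 0.2 0.1\n0.1 0.1 0.8\n0.3 0.4 0.3\n", float)
labels = labels_from_segments(seg, probs)
```

Because the distributions are stored per segment rather than per pixel, the argmax is taken S times and then broadcast to pixels through the segment map.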

The names and colors of each class index: [CamVid] [NYUScenes] [MPI-VehicleScenes]

Presentations

ICRA 2013 (Hu) slides: [pptx] [pdf]
ICRA 2013 (Miksik) talk: [pptx] [pdf]
ICRA 2009 talk: [pptx] [pdf]

References

	Efficient 3-D Scene Analysis from Streaming Data. H. Hu, D. Munoz, J. A. Bagnell, M. Hebert. ICRA 2013. [pdf] [supplementary video] [project page] [bibtex]

	Efficient Temporal Consistency for Streaming Video Scene Analysis. O. Miksik, D. Munoz, J. A. Bagnell, M. Hebert. ICRA 2013. [pdf] [supplementary video] [project page] [bibtex]

	Onboard Contextual Classification of 3-D Point Clouds with Learned High-order Markov Random Fields. D. Munoz, N. Vandapel, M. Hebert. ICRA 2009. [pdf] [project page] [bibtex]