Description
Many interpretation problems in computer vision can be viewed as label assignment problems. Two example results from our work are illustrated in the figure above: in object segmentation from 3-D point clouds (left), we wish to assign an object label (building, vegetation, etc.) to each point, and in estimating 3-D geometric surfaces from a 2-D image (right), we wish to assign a surface type (ground, vertical structure, sky) to each pixel. One way to approach these problems is with structured prediction, i.e., we try to model the dependencies among the predictions.
One popular framework for learning this context is the Conditional Random Field (CRF). While most CRFs model pairwise relations, e.g., pairs of neighboring pixels, recent work from Kohli et al. has demonstrated the benefit of using high-order models. With high-order models, in addition to using local features per pixel (e.g., pixel color), we can incorporate richer features over large regions (e.g., shape) into the model in a principled manner. As illustrated in the figure below, these regions can come from any segmentation algorithm, such as mean-shift or k-means.
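To make the role of the region term concrete, here is a minimal Python sketch of the energy such a model assigns to a labeling. It is an illustration under stated assumptions, not the implementation from this work: the names `robust_potts` and `energy`, the parameters `gamma_max`, `q`, and `pair_weight`, and the toy data are all hypothetical, and the region cost is a simplified form of the Robust Potts potential of Kohli et al. with hand-set constants rather than learned, feature-dependent ones.

```python
from collections import Counter


def robust_potts(clique_labels, gamma_max, q):
    """Simplified Robust Potts cost for one region (clique).

    The cost grows linearly with the number of variables that disagree
    with the region's most common label and is capped at gamma_max.
    gamma_max and q are hypothetical, hand-set constants.
    """
    counts = Counter(clique_labels)
    dominant_count = max(counts.values())
    disagreeing = len(clique_labels) - dominant_count
    return min(disagreeing * gamma_max / q, gamma_max)


def energy(labels, unary, edges, regions, pair_weight=1.0, gamma_max=2.0, q=2):
    """Total cost of a labeling: unary + pairwise Potts + high-order terms.

    labels  : list of label indices, one per pixel/point
    unary   : unary[i][k] is the local cost of giving node i label k
    edges   : list of (i, j) neighbor pairs
    regions : list of node-index lists, e.g. from a segmentation algorithm
    """
    total = sum(unary[i][k] for i, k in enumerate(labels))
    # Pairwise Potts: pay pair_weight whenever two neighbors disagree.
    total += sum(pair_weight for i, j in edges if labels[i] != labels[j])
    # High-order term: one Robust Potts cost per region.
    total += sum(
        robust_potts([labels[i] for i in region], gamma_max, q)
        for region in regions
    )
    return total


# Toy example: four nodes in a chain, all covered by a single region.
unary = [[0, 1], [0, 1], [1, 0], [0, 1]]
edges = [(0, 1), (1, 2), (2, 3)]
regions = [[0, 1, 2, 3]]

noisy = energy([0, 0, 1, 0], unary, edges, regions)   # follows local evidence
smooth = energy([0, 0, 0, 0], unary, edges, regions)  # consistent within region
print(noisy, smooth)
```

In this toy setup the labeling that follows the local evidence at every node pays both pairwise and region penalties, so the region-consistent labeling ends up with the lower energy. Unlike a plain Potts term, the capped region cost does not force the whole region to agree, so a few dissenting nodes are tolerated at a bounded price.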
Unlike the work of Kohli et al., in this work we learn effective high-order models from training data. Building on optimization techniques from Ratliff et al., we demonstrate how to train high-order, non-parametric models and report better performance with these non-parametric models than with previously used parametric models on two distinct applications: 3-D point cloud classification and Geometric Surface Context. The procedure is simple to implement, scales to large datasets and to many features and labels, and is discussed in depth in our CVPR 2009 paper.
Below we compare some classification results between parametric (left) and non-parametric (right) high-order models.
3-D Point Cloud Classification. Note that the boundaries of Ground and Vegetation are much better preserved with the non-parametric model, resulting in a cleaner segmentation.
Colors: green = Veg., orange = Ground, red = Wall, sky-blue = Wire, dark-blue = Pole/Trunk
Geometric Surface Context. The non-parametric model rectifies severe errors made by the parametric model and improves overall accuracy by 13 percentage points.
Colors: green = Ground, red = Vertical Structure, purple = Sky
Datasets
Code
Download an extended implementation of this work (C++, Modified BSD license) for training (non-linear, Robust) Potts potentials over arbitrary-sized cliques: [code] [readme.txt]
Presentation
References
Contextual Classification with Functional Max-Margin Markov Networks. CVPR 2009. [pdf] [project page] [bibtex]
Onboard Contextual Classification of 3-D Point Clouds with Learned High-order Markov Random Fields. ICRA 2009. [pdf] [project page] [bibtex]
On Two Methods for Semi-Supervised Structured Prediction. Tech. Report CMU-RI-TR-10-02, Robotics Institute, 2010. [pdf] [project page] [bibtex]
Funding
- Siebel Scholarship
- Collaborative Technology Alliance Program, Cooperative Agreement DAAD19-01-209912