3D Scene Analysis
People
Martial Hebert
Takeo Kanade
Description
We study the problem of generating plausible interpretations of a scene from a collection of line segments automatically extracted from a single indoor image. We show that we can recognize the three-dimensional structure of the interior of a building, even in the presence of occluding objects. Geometric reasoning proposes several physically valid structure hypotheses, which are then verified against the line segments to find the best-fitting model; that model is then converted to a full 3D model. Our experiments demonstrate that our structure recovery from line segments is comparable to methods that use full image appearance. Our approach shows how a set of rules describing geometric constraints between groups of segments can be used to prune scene interpretation hypotheses and to generate the most plausible interpretation.
It is easy for us to recognize the structure from a collection of line segments in the image at top right, as well as to locate a few doors. However, automatic recognition of structure from a collection of line segments is challenging, because low-level image processing does not perfectly detect all the lines that define the building structure. To further complicate the problem, extra edges may lie on the surfaces of walls or even on objects that are not part of the target structure. We can still interpret the collection of line segments because 1) we perform geometric reasoning and only consider physically plausible interpretations, 2) we have the ability to look globally at the overall structure, and 3) we have prior knowledge of how the world, in our case the interior of a building, is structured.
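The propose, prune, and verify loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring rule (total length of detected segments lying within a pixel tolerance of some hypothesized boundary), the `is_valid` rule predicate, and all function names are assumptions introduced only for illustration.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]
Hypothesis = List[Segment]  # hypothetical: a structure given by its boundary segments


def length(s: Segment) -> float:
    """Euclidean length of a segment."""
    (x1, y1), (x2, y2) = s
    return math.hypot(x2 - x1, y2 - y1)


def point_to_segment_distance(p: Point, s: Segment) -> float:
    """Distance from point p to the closest point on segment s."""
    (x1, y1), (x2, y2) = s
    dx, dy = x2 - x1, y2 - y1
    denom = dx * dx + dy * dy
    if denom == 0.0:  # degenerate segment: treat it as a single point
        return math.hypot(p[0] - x1, p[1] - y1)
    # Project p onto the segment's line and clamp to the segment's extent.
    t = ((p[0] - x1) * dx + (p[1] - y1) * dy) / denom
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (x1 + t * dx), p[1] - (y1 + t * dy))


def supports(detected: Segment, model: Segment, tol: float) -> bool:
    """A detected segment supports a model segment if both endpoints lie near it."""
    return all(point_to_segment_distance(p, model) <= tol for p in detected)


def score(hypothesis: Hypothesis, detected: Sequence[Segment], tol: float = 3.0) -> float:
    """Assumed fit measure: total length of detected segments explained by the hypothesis."""
    return sum(
        length(d)
        for d in detected
        if any(supports(d, m, tol) for m in hypothesis)
    )


def best_hypothesis(
    hypotheses: Sequence[Hypothesis],
    detected: Sequence[Segment],
    is_valid: Callable[[Hypothesis], bool],
    tol: float = 3.0,
) -> Optional[Hypothesis]:
    """Prune hypotheses with a geometric rule, then keep the best-scoring survivor."""
    valid = [h for h in hypotheses if is_valid(h)]
    if not valid:
        return None
    return max(valid, key=lambda h: score(h, detected, tol))
```

Segments that match no hypothesized boundary, such as edges on wall surfaces or on objects, contribute nothing to the score, which is one simple way to tolerate clutter. A real system would replace `is_valid` with actual geometric constraints between groups of segments.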
[Figures: Geometric Rules on Corners; Sample Building Models; Sample of Structure Hypotheses; Recovered 3D Building Structure; Objects in Scene]
Video of Results
Scene Analysis. [WMV 5MB]
References
D. C. Lee, M. Hebert, and T. Kanade, Geometric Reasoning for Single Image Structure Recovery. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), June 2009.
Funding
This research is supported by:
- NSF Grant EEC-0540865