Robotics Institute Seminar, September 28, 2001
Movies to Geometric 3D Models: the Structure from Motion Problem
John Oliensis
NEC Research Institute
1305 Newell-Simon Hall
Refreshments 3:15 pm
Talk 3:30 pm
I describe some of my recent results on the Structure-from-Motion problem
(SFM). Given a sequence of photographic images of a fixed 3D scene, taken by
a camera at several unknown positions and orientations, the problem is to
recover 1) a 3D geometric model of the scene (structure), and 2) the camera's
position and orientation for each image (motion).
One seeks estimates that optimally explain the image data: thus, SFM is an
optimization problem. Formally, the goal is to find the estimate of the
scene and motion minimizing the "error" between the data predicted by the
estimate and the actual image data. To understand the SFM problem, and to
ensure that algorithms avoid false reconstructions, one must understand the
shape of the "error surface," i.e., how the error depends on the estimate.
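As an illustration of the error described above, here is a minimal sketch of a sum-of-squares reprojection error. It is not the speaker's formulation: the pinhole camera with unit focal length, the rotation restricted to the y axis, and all function names and scene values are assumptions chosen to keep the example short.

```python
import math

def rotate_y(point, angle):
    """Rotate a 3D point about the y axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def project(point, angle, translation):
    """Pinhole projection (unit focal length, an assumption) of a world point
    into a camera whose pose is a y-axis rotation followed by a translation."""
    x, y, z = rotate_y(point, angle)
    x, y, z = x + translation[0], y + translation[1], z + translation[2]
    return (x / z, y / z)

def reprojection_error(points, cameras, observations):
    """Sum of squared differences between predicted and observed image points.

    cameras is a list of (angle, translation) pairs; observations[i][j] is the
    measured image of points[j] in camera i.
    """
    total = 0.0
    for (angle, t), obs in zip(cameras, observations):
        for p, (u, v) in zip(points, obs):
            pu, pv = project(p, angle, t)
            total += (pu - u) ** 2 + (pv - v) ** 2
    return total

# Synthetic scene: the "true" structure and motion generate noise-free data.
true_points = [(0.0, 0.0, 5.0), (1.0, -1.0, 6.0), (-1.0, 0.5, 4.0)]
true_cameras = [(0.0, (0.0, 0.0, 0.0)), (0.1, (-0.5, 0.0, 0.0))]
observations = [[project(p, a, t) for p in true_points] for a, t in true_cameras]

# Evaluate the error at the true estimate and at a perturbed motion estimate.
wrong_cameras = [true_cameras[0], (0.2, (-0.5, 0.0, 0.0))]
error_true = reprojection_error(true_points, true_cameras, observations)
error_wrong = reprojection_error(true_points, wrong_cameras, observations)
```

Evaluating `reprojection_error` over a grid of candidate estimates would trace out a slice of the error surface for this toy scene.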
My recent results include:
An analytic model of the error surface. (This extends the work of
Jepson/Heeger/Maybank and my previous result that the error surface has a
characteristic local minimum.) The model applies to both planar and nonplanar
scenes (which is crucial, since most 3D scenes are in effect nearly
planar) and shows how the well-known two-fold ambiguity for planar scenes
affects the error surface. Using this model, one can show that the error
surface has no local minima under some conditions.
For sequences of two images, a simple, exact expression for the
error that depends only on the camera positions/orientations. This gives a
fast algorithm, since one can estimate the motion by minimizing the
expression over the motion unknowns, avoiding a time-consuming minimization
over a large number of structure unknowns. I also present a solution to the
triangulation problem: a simple, exact expression for the optimal
estimate of the structure given the motion. In addition, I demonstrate a new
ambiguity in recovering the structure by triangulation.
Multi-image algorithms that compute directly from the photographic
image data, without needing to iterate from an initial guess at the
unknowns. When such data are available, this approach can simultaneously use
3D points or lines pre-tracked over the sequence, or
measurements of the affine deformations of image patches over time. The
approach is designed for sequences where the camera makes small movements,
e.g., hand-held video sequences. It is simple to implement and gives results
superior to those of the Sturm/Triggs algorithm.
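To make the triangulation problem mentioned above concrete (recovering structure once the motion is known), here is a minimal sketch of the standard midpoint method for two views. This is a swapped-in textbook technique, not the exact optimal expression presented in the talk, and in general it does not minimize the image error. The function names, camera centers, and ray directions are hypothetical.

```python
def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint triangulation of one scene point from two viewing rays.

    c1, c2 are the camera centers; d1, d2 are the directions of the rays
    through the observed image points, all in world coordinates. Returns the
    midpoint of the shortest segment joining the two rays.
    """
    w = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; depth is undetermined")
    s = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point on ray 2
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))

# Noise-free example: both rays pass through the same scene point.
scene_point = (1.0, 2.0, 5.0)
center_1 = (0.0, 0.0, 0.0)
center_2 = (1.0, 0.0, 0.0)
ray_1 = tuple(x - c for x, c in zip(scene_point, center_1))
ray_2 = tuple(x - c for x, c in zip(scene_point, center_2))
estimate = triangulate_midpoint(center_1, ray_1, center_2, ray_2)
```

With noisy image measurements the two rays no longer intersect, and the midpoint is only one of several possible compromises between them.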
After receiving his Ph.D. in theoretical particle physics for research at the University of Chicago and Princeton University, John continued his physics research at the Fermi National Accelerator Laboratory and the Argonne National Laboratory. His interests then shifted to computer vision. In 1988 he joined the University of Massachusetts at Amherst to conduct research in computer vision as a member of the research faculty. He has been with NECI since 1994, where his interests include the reconstruction of object shape from images, the recognition of objects, and human vision.
For appointments, please contact Jianbo Shi (jshi@cs.cmu.edu).