Grouping-Based Low-Rank Trajectory Completion and 3D Reconstruction
1 EECS, UC Berkeley
2 Universidad de Zaragoza, Spain
3 Universidad de los Andes, Colombia
Abstract
Extracting 3D shape of deforming objects in monocular videos, a task known as
non-rigid structure-from-motion (NRSfM), has so far been studied only on synthetic
datasets and in controlled environments. Typically, the objects to reconstruct
are pre-segmented, they exhibit limited rotations and occlusions, or full-length
trajectories are assumed. In order to integrate NRSfM into current video analysis
pipelines, one needs to consider as input realistic (and thus incomplete) tracking,
and perform spatio-temporal grouping to segment the objects from their surroundings.
Furthermore, NRSfM needs to be robust to noise in both segmentation and
tracking, e.g., drifting, segmentation "leaking", optical flow "bleeding", etc. In
this paper, we make a first attempt towards this goal, and propose a method that
combines dense optical flow tracking, motion trajectory clustering and NRSfM
for 3D reconstruction of objects in videos. For each trajectory cluster, we compute
multiple reconstructions by minimizing the reprojection error and the rank
of the 3D shape under different rank bounds of the trajectory matrix. We show
that dense 3D shape is extracted and trajectories are completed across occlusions
and low textured regions, even under mild relative motion between the object and
the camera. We achieve competitive results on a public NRSfM benchmark while
using fixed parameters across all sequences and handling incomplete trajectories,
in contrast to existing approaches. We further test our approach on popular video
segmentation datasets. To the best of our knowledge, our method is the first to
extract dense object models from realistic videos, such as those found on YouTube
or in Hollywood movies, without object-specific priors.
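
To illustrate what completing trajectories under a rank bound means, here is a minimal sketch, not the method of the paper. It assumes a generic "hard-impute" scheme: missing entries of a trajectory matrix are filled by repeatedly projecting onto matrices of rank at most `rank` via a truncated SVD. The function name `complete_low_rank`, the matrix sizes, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def complete_low_rank(W, observed, rank, n_iters=500):
    """Fill missing entries of W with a rank-bounded estimate.

    Generic hard-impute sketch (an assumption, not the paper's algorithm).
    W        : 2D array of trajectory coordinates (values at missing entries are ignored)
    observed : boolean mask, True where a track point was actually observed
    rank     : upper bound on the rank of the completed matrix
    """
    X = np.where(observed, W, 0.0)  # start with zeros at the missing entries
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]  # best approximation of rank <= `rank`
        X = np.where(observed, W, L)  # keep observed entries, update the missing ones
    return L

# Synthetic example: an exactly rank-2 matrix with about 20% of its entries hidden.
rng = np.random.default_rng(0)
W_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 40))
observed = rng.random(W_true.shape) < 0.8
W_hat = complete_low_rank(W_true, observed, rank=2)

# Relative error measured only on the entries that were hidden from the solver.
missing = ~observed
rel_err = np.linalg.norm((W_hat - W_true)[missing]) / np.linalg.norm(W_true[missing])
print(f"relative error on missing entries: {rel_err:.3g}")
```

Running the same function with several values of `rank` yields one completion per rank bound, which loosely mirrors the idea of computing multiple reconstructions under different rank bounds; how the paper selects among them is not specified here.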
The synthetic face dataset used in the paper is attributed to Garg et al. and can be found here.
Last update: Nov, 2014.