Task 2: Context-based Tracking


Calibration-free View Augmentation for Semi-Autonomous VSAMs


---
Effective real-time tracking is an indispensable component of our calibration-free view augmentation system. Graphical objects representing targets of interest (usually vehicles in this FRE) are to be overlaid on live video. Targets' positions are represented relative to landmark features that are tracked in the live video stream, so both targets and landmarks must be tracked reliably despite occlusions. The general approach is to use an adaptive algorithm that chooses appropriate trackers on the basis of viewing context, target dynamics and geometry, and tracker performance, and to fuse the state information those trackers produce.

The development of robust real-time vehicle tracking systems is a challenging task. Targets have nonlinear dynamics: they maneuver, change in appearance, and may be obscured. This research task is to develop context-based tracking, meaning that the method and its internal models adapt to the changing surveillance situation. The output of the tracker can be as simple as an image location or as complex as a full state estimate for the target (its 3-D location, pose, and velocity). The problem involves two sub-areas: (1) 3-D geometric target models and dynamic models, for reliability and to aid prediction, and (2) temporal filters and information fusion, for smooth tracker output and better target state information.
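The range of tracker outputs described above, from a bare image location to a full state estimate, can be sketched as a single record with optional fields. The field names below are our own illustrative choices, not a representation taken from the proposal.

```python
from dataclasses import dataclass

@dataclass
class TargetState:
    """Illustrative tracker output, from minimal to full state.

    Only the image location is mandatory; richer trackers may also
    fill in 3-D location, pose, and velocity.  All names here are
    assumptions for illustration.
    """
    u: float                # image column of the target (pixels)
    v: float                # image row of the target (pixels)
    xyz: tuple = None       # optional 3-D location
    pose: tuple = None      # optional orientation (e.g. roll, pitch, yaw)
    velocity: tuple = None  # optional 3-D velocity

# A simple image-plane tracker reports only (u, v) ...
minimal = TargetState(u=320.0, v=240.0)
# ... while a model-based tracker can report the full state.
full = TargetState(u=320.0, v=240.0, xyz=(5.0, 0.0, 20.0),
                   pose=(0.0, 0.0, 1.57), velocity=(1.0, 0.0, 0.0))
```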

Under Task 2, we shall pursue tracking techniques that operate as well as possible with no prior information, but which can take advantage of geometric information and dynamic models (of other VSAMs, wheeled and tracked vehicles, people). Such knowledge allows more accurate prediction in the case of obscuration or dropout. Dynamic models allow tracking in car-parameter space (steering angle plus speed magnitude).
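Tracking in car-parameter space means the predictor works with the vehicle's steering angle and speed rather than raw image motion. A minimal sketch of such a dynamic model is the kinematic bicycle model below; the function name, parameters, and numeric values are our own assumptions for illustration, not the proposal's model.

```python
import math

def predict_bicycle(x, y, heading, speed, steering, wheelbase, dt):
    """Advance a kinematic bicycle (car) model by one time step.

    (x, y) is the rear-axle position, heading is the yaw angle in
    radians, steering is the front-wheel angle, and wheelbase is the
    axle-to-axle distance.  A simple Euler integration step.
    """
    x += speed * math.cos(heading) * dt
    y += speed * math.sin(heading) * dt
    heading += (speed / wheelbase) * math.tan(steering) * dt
    return x, y, heading

# With zero steering the model predicts straight-line motion:
state = (0.0, 0.0, 0.0)
for _ in range(10):
    state = predict_bicycle(*state, speed=10.0, steering=0.0,
                            wheelbase=2.5, dt=0.1)
```

During an obscuration, a predictor like this can coast the target forward in car-parameter space until the vehicle reappears.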

The target's pose, or orientation, is important information for an observer and can improve tracker prediction and robustness. Weak (affine) camera models and nonlinear optimization techniques are the state of the art in pose recovery. The former technique is fast, but inaccurate if the target has too much depth, is too close, or lies too far off the optical axis. The latter can fail to converge for certain object poses. We have implemented both methods, so far with simulated data only. Lowe's and Kakikura's algorithms are of the optimization variety. Araujo's algorithm improves on Lowe's and Kakikura's algorithms in accuracy and convergence: it relaxes some unnecessary mathematical constraints. We propose, early in the effort, to characterize the conditions under which the affine tracker or one based on Araujo's algorithm is preferable.
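The accuracy limits of the weak (affine) camera model can be seen by comparing it with full perspective projection: weak perspective scales every point on the object by the same average depth, so its error grows when the target is close relative to its own depth. A small sketch (focal length and geometry are illustrative values of our own):

```python
def perspective(X, Y, Z, f):
    """Full perspective projection of a 3-D point onto the image plane."""
    return f * X / Z, f * Y / Z

def weak_perspective(X, Y, Z_avg, f):
    """Weak-perspective projection: scale by the object's average depth."""
    s = f / Z_avg
    return s * X, s * Y

f = 500.0  # focal length in pixels (illustrative)

# A point 0.5 units in front of an object centred 20 units away:
# the weak-perspective error in the projected x-coordinate is small.
far = perspective(1.0, 0.0, 19.5, f)[0] - weak_perspective(1.0, 0.0, 20.0, f)[0]

# The same point on an object only 2 units away: the error grows sharply,
# which is why the affine model breaks down for close or deep targets.
near = perspective(1.0, 0.0, 1.5, f)[0] - weak_perspective(1.0, 0.0, 2.0, f)[0]
```

This depth-to-distance ratio is exactly the kind of condition the proposed characterization would quantify when choosing between the affine tracker and an optimization-based one.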

Temporal (Kalman) filters can be used adaptively to deal with the complex dynamics seen in scenes of maneuvering objects, but we believe adaptive lattice filters have similar properties and, since they adapt their expectations in real time, are better suited to predicting maneuvering trajectories. We shall incorporate modern filtering ideas (like those from Oxford) to improve tracker performance.
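As a baseline for the temporal filtering discussed above, the sketch below is a one-dimensional constant-velocity Kalman filter, written out with explicit 2x2 covariance arithmetic. The noise parameters and the constant-velocity assumption are our own illustrative choices; an adaptive filter would adjust such quantities online.

```python
def kalman_cv_1d(measurements, dt=1.0, q=0.01, r=1.0):
    """1-D constant-velocity Kalman filter (illustrative parameters).

    State is [position, velocity]; q is process noise added to the
    covariance diagonal, r is measurement noise.  Returns the filtered
    position estimates, one per measurement.
    """
    x = [measurements[0], 0.0]           # state estimate
    P = [[1.0, 0.0], [0.0, 1.0]]         # state covariance
    out = []
    for z in measurements:
        # Predict with F = [[1, dt], [0, 1]].
        xp = [x[0] + dt * x[1], x[1]]
        Pp = [[P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q,
               P[0][1] + dt * P[1][1]],
              [P[1][0] + dt * P[1][1],
               P[1][1] + q]]
        # Update with a scalar position measurement (H = [1, 0]).
        S = Pp[0][0] + r                 # innovation variance
        K = [Pp[0][0] / S, Pp[1][0] / S] # Kalman gain
        y = z - xp[0]                    # innovation
        x = [xp[0] + K[0] * y, xp[1] + K[1] * y]
        P = [[(1 - K[0]) * Pp[0][0], (1 - K[0]) * Pp[0][1]],
             [Pp[1][0] - K[1] * Pp[0][0], Pp[1][1] - K[1] * Pp[0][1]]]
        out.append(x[0])
    return out

# A noiseless constant-velocity track: the filter learns the velocity
# and its position estimates converge toward the measurements.
out = kalman_cv_1d([float(k) for k in range(10)])
```

A filter like this smooths tracker output but lags behind a maneuver; that lag is the motivation the paragraph above gives for adaptive lattice filters.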

Approaches tailored to specific geometric and imaging situations are more precise, but since they rely on accurate models and known conditions, they are not robust to changes in model or conditions. This motivates our Task 2 goal: adaptively choosing the best tracker for the current situation, and using statistical tools to combine estimators. An analogous approach has been suggested, but that work did not address the problems of orientation recovery, nor did it consider the possibility of using independent trackers working on different state spaces.
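One standard statistical tool for combining estimators is inverse-variance weighting: given two independent estimates of the same quantity with known variances, the fused estimate leans toward the more confident tracker and is never less certain than either input. A minimal scalar sketch (the function and its arguments are our own illustration):

```python
def fuse(est_a, var_a, est_b, var_b):
    """Fuse two independent scalar estimates by inverse-variance weighting.

    Weights each estimate by the reciprocal of its variance; the fused
    variance is never larger than either input variance.
    """
    w_a = 1.0 / var_a
    w_b = 1.0 / var_b
    fused = (w_a * est_a + w_b * est_b) / (w_a + w_b)
    var = 1.0 / (w_a + w_b)
    return fused, var

# Two trackers report the same coordinate with different confidence:
# the fused value lies closer to the more certain estimate (10.0).
pos, var = fuse(10.0, 1.0, 12.0, 4.0)
```

Fusing trackers that work on *different* state spaces (say, image position versus car parameters) first requires mapping their estimates into a common space, which is part of what makes that extension non-trivial.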

Carceroni has implemented the adaptive aspect of lattice filters with recurrent neural nets, and the approach is promising. In optional work we shall apply this filter to the Task 2 domain. In another optional task, we propose to follow one of Lowe's original suggestions for improving his algorithm: using known geometry and extracted information (on pose and how pose is changing) to predict the future location and appearance of a complex target.

This work will be developed as follows.


This page is maintained by Mike Van Wie.

Last update: 11/11/96.