VSAM Testbed System
We have built a VSAM testbed system
to demonstrate how automated video understanding technology enables a single
human operator to monitor a wide area. The testbed consists of multiple sensors distributed across the campus of CMU, tied to a control room located in the Planetary Robotics Building (PRB). There, a central operator control unit (OCU) receives video and Ethernet data from multiple remote sensor processing units (SPUs).
The OCU is responsible for integrating symbolic object trajectory information
accumulated by each of the SPUs together with a 3D geometric site model,
and presenting the results to the user on a map-based graphical user interface
(GUI). Each logical component of the testbed system architecture
is described briefly below.
Sensor Processing Units (SPUs)
The SPU acts as an intelligent filter
between a camera and the VSAM network. Its function is to analyze
incoming video streams for the presence of significant entities or events,
and to transmit the results symbolically to the OCU. This arrangement
allows for many different sensor modalities to be seamlessly integrated
into the system. Performing as much video processing as possible on the SPU reduces the bandwidth requirements of the VSAM network: full video signals need not be transmitted, only the symbolic data extracted from them.
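To make the bandwidth argument concrete, the following sketch shows the kind of compact symbolic record an SPU might transmit in place of raw video. The record fields, encoding, and OCU address are illustrative assumptions, not the actual testbed protocol (the real system uses the CMUPA packet format described below).

```python
import json
import socket
import time

# Hypothetical sketch: an SPU reporting one tracked object symbolically.
# Field names and the OCU address are illustrative, not the real CMUPA format.
OCU_ADDR = ("127.0.0.1", 9000)  # placeholder address for the operator control unit

def send_detection(sock, object_id, obj_class, lat, lon, speed):
    """Package one tracked object as a compact symbolic record and send it."""
    record = {
        "time": time.time(),        # timestamp of the observation
        "id": object_id,            # track identifier assigned by the SPU
        "class": obj_class,         # e.g. "human" or "vehicle"
        "geolocation": [lat, lon],  # estimated ground position
        "speed": speed,             # metres per second
    }
    payload = json.dumps(record).encode()
    sock.sendto(payload, OCU_ADDR)  # tens of bytes, versus megabits of video

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_detection(sock, object_id=17, obj_class="vehicle",
               lat=40.4433, lon=-79.9436, speed=4.2)
```

Even with a verbose text encoding like this, a record is on the order of a hundred bytes per object per observation; a compact binary encoding shrinks it further.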
Logically, each SPU combines a camera
with a local computer that processes the incoming video. Many types
of sensors and SPUs have been incorporated into the VSAM IFD testbed system: a) color cameras with active pan, tilt, and zoom control; b) a thermal sensor; c) a relocatable van; and d) an airborne sensor. In addition, two sensors from other groups have been successfully integrated: e) a Columbia-Lehigh omnicamera; and f) a Texas Instruments indoor activity monitoring system. By using a pre-specified communication protocol, these systems were able to interface directly with the VSAM network.
The relocatable van and airborne SPUs warrant further discussion. The relocatable van SPU consists of a sensor
and pan-tilt head mounted on a small tripod that can be placed on the vehicle
roof when stationary. All video processing is performed on-board
the vehicle, and results from object detection and tracking are assembled
into symbolic data packets and transmitted back to the operator control
workstation using a radio Ethernet connection. The major research issue in demonstrating the relocatable van is rapid calibration of sensor pose after each redeployment, so that object detection and tracking results can be integrated into the VSAM network (via computation of geolocation) for display at the operator control console.
The airborne sensor and computation
packages are mounted on a Britten-Norman Islander twin-engine aircraft
operated by the U.S. Army Night Vision and Electronic Sensors Directorate.
The Islander is equipped with a FLIR Systems Ultra-3000 turret that has
two degrees of freedom (pan/tilt), a Global Positioning System (GPS) for
measuring position, and an Attitude Heading Reference System (AHRS) for
measuring orientation. The continual self-motion of the aircraft
introduces challenging video understanding issues. For this reason,
video processing is performed using the Pyramid Vision Technologies PVT-200,
a specially designed video processing engine.
Operator Control Unit (OCU)
The VSAM OCU accepts video processing
results from each of the SPUs and integrates the information with a site
model and a database of known objects to infer activities that are of interest
to the user. This data is sent to the GUI and other visualization
tools as output from the system. One key piece of system functionality
provided by the OCU is sensor arbitration. Care must be taken to ensure
that an outdoor surveillance system does not underutilize its limited sensor
assets. Sensors must be allocated to surveillance tasks in such a way that
all user-specified tasks get performed, and, if enough sensors are present,
multiple sensors are assigned to track important objects. The system performs a greedy optimization over a tasking cost function to determine the combination of SPU taskings that best satisfies overall system performance requirements.
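As a rough illustration of this arbitration scheme, the sketch below greedily assigns each task, highest priority first, to the sensor with the lowest tasking cost. The cost terms (visibility as a hard constraint, distance, and a penalty for already-busy sensors) are plausible assumptions, not the fielded cost function.

```python
# Illustrative greedy sensor arbitration driven by a tasking cost function.
# The cost terms and weights below are assumptions, not the actual OCU code.

def tasking_cost(sensor, task):
    """Lower is better: prefer visible, nearby, currently idle sensors."""
    if not sensor["can_see"](task["location"]):
        return float("inf")                  # visibility is a hard constraint
    distance = sensor["distance_to"](task["location"])
    busy_penalty = 10.0 if sensor["busy"] else 0.0
    return distance + busy_penalty

def assign_tasks(sensors, tasks):
    """Greedily pair each task, highest priority first, with its cheapest sensor."""
    assignments = {}
    for task in sorted(tasks, key=lambda t: -t["priority"]):
        best = min(sensors, key=lambda s: tasking_cost(s, task))
        if tasking_cost(best, task) != float("inf"):
            assignments[task["name"]] = best["name"]
            best["busy"] = True              # later tasks will prefer idle sensors
    return assignments
```

A greedy pass like this is not globally optimal, but it is cheap enough to re-run whenever the task list or object positions change.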
The OCU also contains a site
model representing VSAM-relevant information about the area being monitored.
This includes both geometric and photometric information about the scene,
represented using a combination of image and symbolic data.
The OCU uses the site model to support a) object geolocation via intersection
of viewing rays with the terrain, b) visibility analysis (predicting what
portions of the scene are visible from what sensors) so that sensors can
be efficiently tasked, and c) specification of the geometric location and
extent of relevant scene features. For example, we might directly task
a sensor to monitor the door of a building, or to look for vehicles passing
through a particular intersection.
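A minimal sketch of geolocation by ray-terrain intersection (item a above) follows: march along the viewing ray until it drops below the terrain surface. The step size and terrain-elevation accessor are illustrative assumptions.

```python
import numpy as np

# Sketch of object geolocation: walk along the viewing ray and return the
# first point at or below the terrain. Step size and DEM interface assumed.

def geolocate(camera_pos, ray_dir, terrain_height, step=0.5, max_range=2000.0):
    """camera_pos: sensor (x, y, z) in site coordinates.
    ray_dir: vector through the target pixel.
    terrain_height: function (x, y) -> ground elevation z."""
    p = np.asarray(camera_pos, dtype=float)
    d = np.asarray(ray_dir, dtype=float)
    d /= np.linalg.norm(d)                    # normalize so t is in metres
    t = 0.0
    while t < max_range:
        x, y, z = p + t * d
        if z <= terrain_height(x, y):         # ray has pierced the ground
            return (x, y, terrain_height(x, y))
        t += step
    return None                               # no intersection within range

# Toy usage: flat terrain at z = 0, camera 30 m up, looking down at 45 degrees.
print(geolocate((0.0, 0.0, 30.0), (1.0, 0.0, -1.0), lambda x, y: 0.0))
```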
Graphical User Interface (GUI)
One of the technical goals of the VSAM
IFD effort is to demonstrate that a single human operator can effectively
monitor a significant area of interest. Keeping track of multiple
people, vehicles, and their interactions, within a complex urban environment
is a difficult task. The user obviously shouldn't be looking at dozens
of screens showing raw video output. That amount of sensory overload
virtually guarantees that information will be ignored, and requires a prohibitive
amount of transmission bandwidth. Our approach is to provide an interactive,
graphical user interface (GUI) that uses VSAM technology to automatically
place dynamic agents representing people and vehicles into a synthetic
view of the environment. This approach has the benefit that visualization
of scene events is no longer tied to the original resolution and viewpoint
of a single video sensor. The GUI currently consists of a map of the area,
overlaid with all object locations, sensor platform locations, and sensor
fields of view. In addition, a low-bandwidth, compressed video
stream from one of the sensors can be selected for real-time display. The
GUI is also used for sensor suite tasking. Through this interface,
the operator can task individual sensor units, as well as the entire testbed
sensor suite, to perform surveillance operations such as generating a quick
summary of all object activities in the area.
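As one concrete piece of that map display, the sketch below computes the map-plane wedge for a sensor's field of view. The pan angle, angular width, and range parameters are illustrative, not the testbed GUI's actual code.

```python
import math

# Sketch of a sensor field-of-view wedge for the map overlay.

def fov_polygon(sensor_xy, pan_deg, fov_deg, range_m, steps=8):
    """Return map-plane vertices of the wedge a sensor currently sees."""
    sx, sy = sensor_xy
    pts = [(sx, sy)]                          # wedge apex at the sensor
    for i in range(steps + 1):                # sample the far arc
        a = math.radians(pan_deg - fov_deg / 2 + i * fov_deg / steps)
        pts.append((sx + range_m * math.cos(a), sy + range_m * math.sin(a)))
    return pts

# A camera at (100, 200) panned to 45 degrees, 30-degree FOV, 150 m range.
print(fov_polygon((100.0, 200.0), pan_deg=45.0, fov_deg=30.0, range_m=150.0))
```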
Communication
The nominal architecture for the VSAM
network allows multiple OCUs to be linked together, each controlling multiple
SPUs. Each OCU supports exactly one GUI through which all user-related
command and control information is passed. Data dissemination is not limited to a single user interface -- system output is also accessible through a series of visualization (VIS) nodes.
There are two independent communication
protocols and packet structures supported in this architecture: the Carnegie
Mellon University Packet Architecture (CMUPA) and the Distributed Interactive
Simulation (DIS) protocols. The CMUPA is designed to be a low-bandwidth,
highly flexible architecture in which relevant VSAM information can be
compactly packaged without redundant overhead. All communication
between SPUs, OCUs and GUIs is CMUPA compatible. The CMUPA protocol specification
document and code can be downloaded.
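To illustrate the kind of compact, low-overhead packaging CMUPA aims for, here is a sketch that packs tracked-object reports into a fixed binary layout. The field layout is invented for illustration; the real CMUPA specification defines its own headers and block structure.

```python
import struct

# Illustrative compact packing of tracked-object reports; the actual
# CMUPA packet layout is defined in its specification, not here.
DETECTION = struct.Struct("!I B f f f")  # id, class code, lat, lon, speed

def pack_detections(detections):
    """Pack (id, class_code, lat, lon, speed) tuples into one binary packet."""
    header = struct.pack("!H", len(detections))   # 2-byte object count
    return header + b"".join(DETECTION.pack(*d) for d in detections)

packet = pack_detections([(17, 1, 40.4433, -79.9436, 4.2)])
print(len(packet), "bytes")  # 19 bytes for one tracked object
```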
VIS nodes are designed to distribute
the output of the VSAM network to where it is needed. They provide symbolic
representations of detected activities overlaid on maps or imagery. Information
flow to VIS nodes is unidirectional, originating from an OCU. All
of this communication uses the DIS protocol, developed and used by the
Distributed Simulation community. An important benefit of keeping VIS nodes DIS-compatible is that it allows us to easily interface with synthetic environment visualization tools such as ModSAF and ModStealth. See the
section on VSAM visualization.
Current Testbed Infrastructure
As of Fall 1999, the VSAM IFD testbed system at Carnegie Mellon University consists of 14 cameras distributed throughout campus. All cameras are connected to the VSAM
Operator Control Room in the Planetary Robotics Building (PRB): ten are
connected via fiber optic lines, three on PRB are wired directly to the
SPU computers, and one is a portable Small Unit Operations (SUO) unit connected
via wireless Ethernet to the VSAM OCU. The work done for VSAM 99
concentrated on increasing the density of sensors in the Wean/PRB area.
The overlapping fields of view in this area of campus enable us to conduct
experiments in wide baseline stereo, object fusion, sensor cuing and sensor
handoff.
The backbone of the
CMU campus VSAM system consists of six Sony EVI-370 color zoom cameras
installed on PRB, Smith Hall, Newell-Simon Hall, Wean Hall, Roberts Hall,
and Porter Hall. All have active pan, tilt and zoom control. Five of these
units are mounted on Directed Perception pan/tilt heads, and one is on a Sagebrush
Technologies pan/tilt head. Two stationary fixed-FOV color cameras mounted
on the peak of PRB facilitate work on activity analysis, classification,
and sensor cuing. Three stationary fixed-FOV monochrome cameras mounted
on the roof of Wean Hall are connected to the Operator Control Room over
a single multimode fiber using a video multiplexor. A Raytheon NightSight
PalmIR thermal (FLIR) sensor can also be mounted on Wean. A portable sensor
unit was built to allow further software development and research at CMU
in support of the DARPA Small Unit Operations (SUO) program. This unit
consists of the same hardware as SPUs that were delivered to Fort Benning,
Georgia in 1999.
The Operator Control Room in PRB
houses the SPU, OCU, GUI and development workstations -- nineteen computers
in total. The four most recent SPUs are Pentium III 550 MHz computers.
Dagwood, a single "compound SPU", is a quad-processor 550 MHz Xeon computer,
purchased to conduct research on classification, activity analysis, and
digitization of three simultaneous video streams. Also included in this
list of machines is a Silicon Graphics Origin 200, used to develop video database storage and retrieval algorithms and to design user interfaces for handling VSAM video data.
Two auto-tracking Leica theodolites
(TPS1100) are installed on the corner of PRB, and are hardwired to a data
processing computer linked to the VSAM OCU. This system allows us to do
real-time automatic tracking of objects to obtain ground truth for evaluating
the VSAM geolocation and sensor fusion algorithms. This data can
be displayed in real time on the VSAM GUI. An Office of Naval Research DURIP grant provided funds for two Raytheon NightSight thermal sensors, the quad-processor Xeon computer, the Origin 200, an SGI Infinite Reality Engine, and the Leica theodolite surveying systems.