Wednesday, November 10, 2004 - 3:00, WeH 4625
Associating Names with Persons in Broadcast News Video
Speaker: Jun Yang

Searching for and identifying the people who appear in broadcast news video leads to better understanding of, and access to, the news video content. This talk addresses two inverse problems concerning the people appearing in news video: (1) person finding, which locates the appearances of named persons in the video, and (2) person naming, which labels individual persons with their names. The bottleneck in the first problem is the temporal mismatch between people's names in the transcript and their visual appearances in the video, which is addressed by introducing a timing pattern factor into the text-based IR method. Combining visual features such as facial similarity and an anchor classifier further improves performance. The second problem is formulated as a classification problem and attacked with a machine learning approach that exploits a variety of multi-modal features, including speaker identification, transcript clues, and temporal video structure. High person-naming accuracy has been reported on ABC World News Tonight video in the TRECVID 2004 dataset. Moreover, since our approach does not rely on face recognition, it is able to name people who have never been seen before.
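
As a rough illustration of the timing pattern idea in person finding, the sketch below weights a text-based relevance score by how far a shot lies in time from a mention of the name in the transcript. The Gaussian form, the parameter values, and the function names (timing_weight, score_shots) are assumptions made for illustration only; the abstract does not specify the model actually used in the talk.

```python
import math


def timing_weight(offset_seconds, mean=0.0, std=10.0):
    """Hypothetical timing factor: a Gaussian over the time offset
    between a name mention and a shot (assumed form, not from the talk)."""
    return math.exp(-0.5 * ((offset_seconds - mean) / std) ** 2)


def score_shots(mention_times, shot_times, text_score=1.0, mean=0.0, std=10.0):
    """Score each shot by summing the text score of every name mention,
    down-weighted by the timing factor for that mention-shot offset."""
    scores = []
    for shot_time in shot_times:
        total = sum(
            text_score * timing_weight(shot_time - mention, mean, std)
            for mention in mention_times
        )
        scores.append(total)
    return scores


if __name__ == "__main__":
    # One mention at t = 100 s; shots centered at 100 s and 130 s.
    for shot_time, score in zip([100.0, 130.0],
                                score_shots([100.0], [100.0, 130.0])):
        print(f"shot at {shot_time:.0f}s: score {score:.4f}")
```

A shot that coincides with the mention keeps the full text score, while shots farther away are discounted. Setting a nonzero mean would model the case where a person typically appears some seconds after (or before) the name is spoken.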