Jia-Yu Pan, Hyung-Jeong Yang, Christos Faloutsos, and Pinar Duygulu. Automatic Multimedia Cross-modal Correlation Discovery.
In Proceedings of the 10th ACM SIGKDD Conference, 2004.
Seattle, WA, August 22-25, 2004
[PDF] (135.4 kB) · [gzipped PostScript] (468.1 kB)
Given an image (or video clip, or song), how do we automatically assign keywords to it? The general problem is to find
correlations across media in a collection of multimedia objects, such as video clips that combine color, motion, audio,
and/or text scripts. We propose a novel, graph-based approach, "MMG", to discover such cross-modal correlations.
The "MMG" method requires no tuning, no clustering, and no user-determined constants; it can be applied to any multimedia
collection, as long as we have a similarity function for each medium; and it scales linearly with the database size.
We report auto-captioning experiments on the "standard" 680 MB Corel image database, where MMG outperforms
domain-specific, fine-tuned methods by up to 10 percentage points in captioning accuracy (a 50% relative improvement).
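The abstract does not spell out the scoring mechanism, only that MMG is graph-based, needs one similarity function per medium, and has no tunable constants. The sketch below assumes a random-walk-with-restarts style propagation over a mixed-media graph, the family of techniques MMG belongs to: nodes are images, regions, and caption words; edges link nearest neighbors under each medium's similarity function; and word nodes are ranked by how often a walk restarting at the query image visits them. The graph layout, parameter values, and function name are illustrative, not taken from the paper.

```python
import numpy as np

def rwr_scores(A, seed, restart=0.15, iters=100, tol=1e-8):
    """Random walk with restarts on a mixed-media graph (illustrative sketch).

    A       : (n, n) symmetric adjacency matrix over all nodes (images,
              regions, caption words), built by linking each node to its
              nearest neighbors under the per-medium similarity function.
    seed    : index of the query node (e.g., an uncaptioned image).
    restart : probability of jumping back to the seed at each step.

    Returns the steady-state visit probabilities; the highest-scoring
    word nodes are the suggested caption terms.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    # Column-normalize so each column is a transition distribution.
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    W = A / col_sums
    r = np.zeros(n)
    r[seed] = 1.0          # restart distribution concentrated on the query
    p = r.copy()
    for _ in range(iters):
        p_new = (1 - restart) * (W @ p) + restart * r
        if np.abs(p_new - p).sum() < tol:
            return p_new
        p = p_new
    return p
```

To caption an image, one would rank the word-node entries of the returned vector and keep the top few; because the graph is sparse (each node keeps only a handful of nearest neighbors), each iteration costs time proportional to the number of edges, consistent with the linear scaling the abstract claims.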
@InProceedings{KDD04CrossModalCorrelation,
  author    = {Jia-Yu Pan and Hyung-Jeong Yang and Christos Faloutsos and Pinar Duygulu},
  title     = {Automatic Multimedia Cross-modal Correlation Discovery},
  booktitle = {Proceedings of the 10th ACM SIGKDD Conference},
  year      = 2004,
  wwwnote   = {Seattle, WA, August 22-25, 2004},
  abstract  = {Given an image (or video clip, or song), how do we automatically assign keywords to it? The general problem is to find correlations across media in a collection of multimedia objects, such as video clips that combine color, motion, audio, and/or text scripts. We propose a novel, graph-based approach, "MMG", to discover such cross-modal correlations. <br> The "MMG" method requires no tuning, no clustering, and no user-determined constants; it can be applied to <i>any</i> multimedia collection, as long as we have a similarity function for each medium; and it scales linearly with the database size. We report auto-captioning experiments on the ``standard'' 680 MB Corel image database, where MMG outperforms domain-specific, fine-tuned methods by up to 10 percentage points in captioning accuracy (a 50\% relative improvement).},
  bib2html_pubtype = {Refereed Conference},
  bib2html_rescat  = {Multimedia Data Mining},
}
Generated by bib2html (written by Patrick Riley) on Wed Sep 01, 2004 13:24:30