Abstract:
Web image retrieval is a challenging task that draws on image processing, link structure analysis, and web text retrieval. Because image processing is considered very difficult, most current large-scale web image search engines exploit text and link structure to "understand" the content of an image. However, local text information, such as captions, filenames, and adjacent text, is not always reliable or informative, so global information should also be taken into account when a web image retrieval system makes relevance judgments. We propose a re-ranking method that improves web image retrieval by reordering the images retrieved from a large-scale web image search engine. The re-ranking process is based on a relevance model, a probabilistic model that evaluates the relevance of the HTML document linking to the image and assigns a relevance probability to the image. Experimental results showed that the re-ranked retrieval achieved better performance than the original retrieval, suggesting that the re-ranking method is effective. The relevance model is learned from the Internet without preparing any training data, and it is independent of the underlying ranking algorithm of the image search engine. The re-ranking process should therefore be applicable to any image search engine with little effort.
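The re-ranking idea can be illustrated with a minimal sketch. The abstract does not specify the estimator, so everything here is an assumption made for illustration: the function names `unigram` and `rerank`, the whitespace tokenizer, the smoothing weight `lam`, and the use of a log-likelihood ratio between a smoothed unigram relevance model and a background model as the relevance score. The `feedback_docs` argument stands in for query-related text gathered from the web, and each image is represented by the text of the HTML document that links to it.

```python
import math
from collections import Counter


def unigram(texts):
    """Maximum-likelihood unigram distribution over whitespace tokens."""
    counts = Counter(tok for t in texts for tok in t.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def rerank(images, feedback_docs, background_docs, lam=0.5):
    """Reorder (image_url, page_text) pairs by the relevance of page_text.

    Hypothetical scoring (an assumption, not the paper's stated formula):
    score(D) = sum_w P(w|D) * log(P_smoothed(w|R) / P(w|C)),
    where R is estimated from feedback_docs and C from all available text.
    """
    rel = unigram(feedback_docs)
    # The background model includes the feedback text so that every
    # relevance-model word has a nonzero background probability.
    bg = unigram(background_docs + feedback_docs)

    def score(page_text):
        doc = unigram([page_text])
        s = 0.0
        for w, p_doc in doc.items():
            p_bg = bg.get(w)
            if p_bg is None:
                continue  # words unseen everywhere carry no evidence
            # Linear interpolation keeps the ratio finite for non-relevant words.
            p_rel = lam * rel.get(w, 0.0) + (1.0 - lam) * p_bg
            s += p_doc * math.log(p_rel / p_bg)
        return s

    # Highest score first; the original engine ranking is not used at all.
    return sorted(images, key=lambda item: score(item[1]), reverse=True)


if __name__ == "__main__":
    images = [
        ("a.jpg", "buy cheap tickets now"),
        ("b.jpg", "golden gate bridge san francisco photo"),
    ]
    feedback = ["golden gate bridge is a suspension bridge in san francisco"]
    background = ["buy cheap tickets now online", "photo of the day"]
    for url, _ in rerank(images, feedback, background):
        print(url)
```

Because the score depends only on the text of the linking page, a sketch like this needs no knowledge of how the search engine produced its initial ranking, which mirrors the independence claimed in the abstract.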
Related Readings:
VisualSEEK: a fully automated content-based image query system. J.R. Smith, S.F. Chang. In Proceedings of ACM Multimedia 1996.
Authoritative Sources in a Hyperlinked Environment
Relevance-Based Language Models
The Anatomy of a Large-Scale Hypertextual Web Search Engine