Adam Berger's Home Page

Adam Berger

Research interests: The application of information theory to real-world problems involving text. Specifically, my thesis work focused on applying information theory to problems in information retrieval, including the ranking, classification, translation and summarization of documents. Other interests include machine learning, statistical inference, data mining, speech processing, and coding theory.

Selected publications (including PhD thesis) are here.

Projects:

The StART project: Statistical methods applied to retrieval technology. Looking at real-world problems in information retrieval, such as classification, retrieval and summarization of documents, from a statistical perspective.
Statistical Machine Translation: I was a member of the Candide group at the IBM Watson Research Center for several years in the early 1990s. The group's mission was to explore the possibilities of fully automatic translation from one language (say, French) to another (say, English) by allowing a computer to inspect a large collection of translated data and, from that collection, "learn" how to translate. John Lafferty and I have revived and extended that work in the Weaver project here at CMU.
Language Modelling: Predicting the next word in a sequence of English text, and related questions. Language models form an integral part of commercial speech and handwriting recognition systems, and have recently been put to use in document retrieval systems.
Maximum Entropy and Exponential Models: Almost all of the work I do on language modelling falls into the maxent/minimum divergence framework. This page contains mostly tutorial information on the use of discrete exponential models in natural language processing.