Time: Tuesday
Place: Wean Hall 5409
Organizer: Dr.
Assistant: Phyllis Pomerantz (plp@cs.cmu.edu). Unless a different person is named below, please contact Phyllis for appointments with the speakers.
Date | Speaker | Affiliation | Topic
9/26/03, 3:30pm, Wean Hall 7220 (note unusual date & room) | Jason Hartline (this talk is part of the theory seminar series, but I highly recommend it to AI seminar series participants also) | CMU Computer Science Department |
| Ron Lavi | | Towards a Characterization of Truthful Combinatorial Auctions
| Drew Bagnell | | Learning Policies: Leveraging Supervised Learning for Planning and Control
| David Blei | UC Berkeley, CS |
| Andrew McCallum | UMass Amherst, Computer Science | Conditional Random Fields for Information Extraction and Coreference
| Eric Xing | UC Berkeley, CS | Generalized Mean Field Inference in Graphical Models (for appointments, contact Jill Lentz [jlentz@cs.cmu.edu])
| Jonathan Schaeffer | |
| Naftali Tishby | School of Computer Science and Engineering, The Hebrew University of Jerusalem |
(note unusual date & room) | Michael Kearns | UPenn CIS |
| Russ Greiner | |
SPRING BREAK | NO TALK | |
| Henry Kautz | |
4/20/04 | Leslie Pack Kaelbling | MIT Computer Science and Artificial Intelligence Lab |
| Michael Lewicki | | Probabilistic models for learning structure in natural scenes and sounds
| Pat Langley | | Computational Induction of Explanatory Process Models
TBA | Tom Mitchell | | TBA (Brain science & machine learning)
One feature of the classical treatment of optimization problems is that the space over which the optimization is performed, i.e., the input description of the problem, is assumed to be publicly known to the optimizer. This assumption does not always accurately represent reality, for example in applications involving resource sharing and electronic commerce on the Internet, where transactions must occur among parties with diverse and selfish interests. In such settings, the inputs to many optimizations are not publicly known in advance. Instead they must be solicited from companies, computerized agents, individuals, etc., which may act selfishly to promote their own self-interests. In particular, they may lie about their values or may not adhere to specified protocols if it benefits them. One of the most fundamental problems where the selfish interests of the participants come into play is auctions. In this talk I will review the worst-case competitive analysis of auctions for digital goods, including the necessary game-theoretic notions, the solution concept of truthful mechanism design, and the framework for competitive analysis of auctions. I will then pose two open problems relating to this work and present what is known about them. The first open problem is the existence of competitive deterministic auctions. On this topic, I will show that there is no symmetric deterministic auction that is competitive; on the other hand, I will show that there is a competitive auction that uses only one random bit. The open question is whether there exist asymmetric deterministic auctions that are competitive. The second open problem concerns the optimal competitive ratio. The best known auction achieves a competitive ratio of 3.39, and I will discuss a lower bound on the competitive ratio of 2.42. Furthermore, I will present two simplifications of the auction problem, one for which the optimal competitive ratio is known, and the other for which it is not.
(This talk presents work that is joint with Amos Fiat, Andrew Goldberg, Anna Karlin, and Mike Saks.)
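The competitive framework above can be made concrete with a small sketch. For digital goods, the benchmark is the revenue of the best single sale price chosen in hindsight, and the Random Sampling Optimal Price (RSOP) auction from this literature is truthful because each bidder's offered price is computed only from the other bidders' bids. This is an illustrative sketch, not the 3.39-competitive auction discussed in the talk; the function names are my own.

```python
import random

def optimal_single_price_revenue(bids):
    """Benchmark F: revenue of the best single price chosen with hindsight."""
    return max(p * sum(1 for b in bids if b >= p) for p in bids)

def random_sampling_auction(bids, rng=random):
    """Random Sampling Optimal Price (RSOP) auction: randomly split the
    bidders into two halves, compute the optimal single price on each
    half, and offer that price to the *other* half.  Truthful, because
    no bidder's offered price depends on her own bid."""
    left, right = [], []
    for b in bids:
        (left if rng.random() < 0.5 else right).append(b)

    def best_price(group):
        if not group:
            return float("inf")  # empty half: no sale to the other side
        return max(group, key=lambda p: p * sum(1 for b in group if b >= p))

    p_left, p_right = best_price(left), best_price(right)
    # Each half buys at the price computed from the opposite half.
    revenue = sum(p_right for b in left if b >= p_right)
    revenue += sum(p_left for b in right if b >= p_left)
    return revenue
```

With bids [1, 2, 3, 4] the hindsight benchmark is 6 (sell at price 2 or 3); the randomized auction's revenue varies with the split but never requires a bidder's own bid to set her price.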
We analyze incentive compatible (truthful) mechanisms over restricted domains of preferences, the leading example being combinatorial auctions. Our work generalizes the characterization of Roberts (1979) who showed that truthful mechanisms over unrestricted domains with at least 3 possible outcomes must be "affine maximizers". We show that truthful mechanisms for combinatorial auctions (and related restricted domains) must be "almost affine maximizers" if they also satisfy an additional requirement of "independence of irrelevant alternatives". This requirement is without loss of generality for unrestricted domains as well as for auctions between two players where all goods must be allocated. This implies unconditional results for these cases, including a new proof of Roberts' theorem. The computational implications of this characterization are severe, as reasonable "almost affine maximizers" are shown to be as computationally hard as exact optimization. This implies the near-helplessness of such truthful polynomial-time auctions in all cases where exact optimization is computationally intractable.
See also http://www.cs.huji.ac.il/~tron for more information.
Joint work with Ahuva Mu'alem and Noam Nisan.
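For readers unfamiliar with affine maximizers: the canonical example is the Vickrey-Clarke-Groves (VCG) mechanism, which is the special case with all player weights equal to 1 and no outcome-specific constants. A minimal sketch (not from the paper; the outcome labels and valuation dictionaries below are hypothetical):

```python
def vcg(valuations, outcomes):
    """Vickrey-Clarke-Groves mechanism: choose the welfare-maximizing
    outcome and charge each player the externality she imposes on the
    others.  VCG is the canonical affine maximizer and is truthful."""
    def welfare(outcome, exclude=None):
        return sum(v[outcome] for i, v in enumerate(valuations) if i != exclude)

    best = max(outcomes, key=welfare)
    payments = []
    for i in range(len(valuations)):
        # Welfare the others could have gotten without player i, minus
        # the welfare the others actually get: i's externality.
        others_best = max(welfare(o, exclude=i) for o in outcomes)
        payments.append(others_best - welfare(best, exclude=i))
    return best, payments

# Two outcomes: allocate a single item to player 0 ("A") or player 1 ("B").
valuations = [{"A": 3, "B": 0}, {"A": 0, "B": 2}]
best, payments = vcg(valuations, ["A", "B"])
```

Here the item goes to player 0 (welfare 3 beats 2), who pays 2, the displaced value of player 1, recovering the second-price auction. The characterization result above says that, under the stated conditions, truthful combinatorial auctions cannot stray far from this template.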
Increasingly, learning systems are called upon to not merely predict,
but to take actions that affect the world. The fields of planning and
control are meanwhile asked to make good sequential decisions in
complex problems where simple analytic models are unavailable. This
confluence of learning and control has driven research to identify
algorithms that can search for good strategies while remaining
efficient in both sample and computational resources.
Policy search approaches are becoming a popular alternative to value
function methods in part because they gracefully handle both large
state-spaces and partial observability. I'll discuss two recent
developments that transplant techniques from supervised learning to
learning policies. First, I'll show how notions from information
geometry give simple algorithms with dramatically better performance
than existing reinforcement learning algorithms. Next, I'll discuss a
new approach that discards Bellman equations and searches directly in
a space of policies while retaining the central insight of dynamic
programming. With appropriate domain knowledge, this algorithm is
efficient in both computational and sample complexity and provides
much stronger guarantees than classical value-function approximation
can. Examples will illustrate how each theoretical advance leads to
state-of-the-art empirical performance on difficult benchmark
problems.
J.A. (Drew) Bagnell is a doctoral candidate in Robotics and an NSF
graduate fellow at Carnegie Mellon University. His interests include
rich probabilistic models for statistical learning, robust and
stochastic control, and the application of machine learning to large,
partially observable planning and control problems.
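The information-geometry idea can be seen in a one-parameter toy example. The natural gradient preconditions the vanilla policy gradient by the inverse Fisher information of the policy; for a Bernoulli (two-armed bandit) policy with a single logit, the p(1-p) factors cancel, so the update no longer stalls on the flat tails of the sigmoid. This toy comparison is my illustration of the principle, not an algorithm from the talk.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def vanilla_step(theta, r0, r1, lr):
    """Plain gradient ascent on J(theta) = p*r1 + (1-p)*r0,
    where p = sigmoid(theta) is the probability of pulling arm 1."""
    p = sigmoid(theta)
    grad = p * (1 - p) * (r1 - r0)      # dJ/dtheta
    return theta + lr * grad

def natural_step(theta, r0, r1, lr):
    """Natural gradient ascent: precondition by the inverse Fisher
    information of a Bernoulli policy, F = p*(1-p).  The factors cancel,
    leaving a step of size lr*(r1 - r0) everywhere in parameter space."""
    p = sigmoid(theta)
    fisher = p * (1 - p)
    grad = p * (1 - p) * (r1 - r0)
    return theta + lr * grad / fisher
```

Starting from a near-deterministic policy (theta = -6), the vanilla step moves by about 0.001 while the natural step moves by the full lr*(r1 - r0), which is the qualitative speedup the information-geometric algorithms exploit.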
I will discuss the problem of learning topic
hierarchies from text data. Model selection in this domain is daunting---which
of the large collection of possible trees to use? Taking a Bayesian approach,
we have developed the nested Chinese restaurant process, a flexible,
nonparametric prior on tree structures which is built via a collection of
distributions on partitions of integers. This prior allows arbitrarily large
branching factors and readily accommodates growing data collections.
We have used this distribution as a prior in a
hierarchical variant of latent Dirichlet allocation
(LDA). LDA is a probabilistic graphical model of text collections in which
documents can exhibit multiple latent topics. In the hierarchical variant,
these topics lie in a hierarchy which itself is random and unknown. A first
approach to inference in such a model is a straightforward Gibbs sampling
algorithm which finds an approximate posterior distribution on topic
hierarchies induced by a collection of documents. This yields interesting
results when applied to a dataset of abstracts from the NIPS conferences.
(Joint work with Thomas Griffiths, Michael Jordan, and
Josh Tenenbaum)
BIO
David Blei is a doctoral
candidate in the Computer Science Department at U.C. Berkeley. He is
interested in probabilistic graphical models applied to problems in
text-processing and information retrieval.
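The generative story of the nested Chinese restaurant process can be sketched in a few lines: each document draws a root-to-leaf path through an infinitely branching tree, where at every node it either follows an existing child (with probability proportional to how many earlier paths took it) or opens a new one (with probability proportional to a parameter gamma). A minimal sketch under those assumptions; the data structures are my own, not the paper's:

```python
import random
from collections import defaultdict

def crp_draw(counts, gamma, rng):
    """Chinese restaurant process draw: existing child k with probability
    proportional to counts[k], a brand-new child with probability
    proportional to gamma."""
    total = sum(counts.values()) + gamma
    r = rng.random() * total
    for k, c in counts.items():
        r -= c
        if r < 0:
            return k
    return max(counts, default=-1) + 1  # open a new child

def nested_crp_path(tree, depth, gamma, rng):
    """Draw one root-to-leaf path of the given depth through the
    (potentially infinitely branching) tree, updating child counts
    so later paths prefer popular subtrees."""
    path, node = [], ()
    for _ in range(depth):
        child = crp_draw(tree[node], gamma, rng)
        tree[node][child] += 1
        node = node + (child,)
        path.append(node)
    return path

# 50 "documents" each pick a depth-3 path; popular subtrees attract more paths.
tree = defaultdict(lambda: defaultdict(int))
rng = random.Random(0)
paths = [nested_crp_path(tree, depth=3, gamma=1.0, rng=rng) for _ in range(50)]
```

In the hierarchical LDA model each node additionally carries a topic distribution over words, and Gibbs sampling resamples each document's path given the others; the rich-get-richer counts are what let the branching factor grow with the data.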
Information
extraction is the process of filling a structured database
from unstructured text. It is a difficult statistical and
computational problem often involving hundreds of thousands of
variables, complex algorithms, and noisy and sparse data. Recently
there has been significant interest in conditionally-trained
alternatives to joint probabilistic models such as HMMs.
In this talk
I will briefly review previous work in finite state, Conditional
Random Field (CRF) models for information extraction, and then
describe four pieces of more recent work: (1) the application of CRFs
to extraction of tables from government reports, (2) feature induction
for these models, applied to named entity extraction, (3) a
random field method for noun co-reference resolution that has strong
ties to graph partitioning, (4) an extension of CRFs
to factorial
state representation, enabling simultaneous part-of-speech tagging and
noun-phrase segmentation.
Joint
work with colleagues at UMass (Ben Wellner, Khashayar
Rohanimanesh, Charles Sutton, David Pinto, Xing Wei, Wei Li, Bruce Croft),
CMU (John Lafferty), and UPenn (Fernando Pereira).
Andrew McCallum is an Associate Professor at the University of Massachusetts Amherst. He was previously Vice President of Research and Development at WhizBang Labs, a company that used machine learning for information extraction from the Web. In the late 1990's he was a Research Scientist and Coordinator at JustResearch, after receiving his PhD from the University of Rochester. He is on the editorial board of the Journal of Machine Learning Research. For the past eight years, McCallum has been active in research on statistical machine learning applied to text, especially information extraction, document classification, finite state models, and learning from combinations of labeled and unlabeled data. Web page: http://www.cs.umass.edu/~mccallum.
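The linear-chain machinery behind these models can be sketched compactly: a tag sequence gets an unnormalized log-score from emission and transition weights, and the forward algorithm computes the log-partition function needed for conditional training in time polynomial in the sequence length rather than exponential. This is a bare-bones sketch with scalar scores standing in for the general weighted feature functions of a CRF; the toy weights are invented.

```python
import itertools
import math

def score(obs, tags, emit, trans):
    """Unnormalized log-score of one tag sequence under a linear-chain
    model: sum of per-position emission scores plus transition scores."""
    s = sum(emit[t][x] for t, x in zip(tags, obs))
    s += sum(trans[a][b] for a, b in zip(tags, tags[1:]))
    return s

def log_partition(obs, tagset, emit, trans):
    """Forward algorithm: log of the sum of exp(score) over all tag
    sequences, in O(len(obs) * |tagset|^2) instead of |tagset|^len(obs)."""
    alpha = {t: emit[t][obs[0]] for t in tagset}
    for x in obs[1:]:
        alpha = {t: emit[t][x] + math.log(sum(math.exp(alpha[s] + trans[s][t])
                                              for s in tagset))
                 for t in tagset}
    return math.log(sum(math.exp(a) for a in alpha.values()))
```

The conditional probability of a labeling is then exp(score - log_partition); on tiny inputs the dynamic program can be checked directly against brute-force enumeration of all tag sequences.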
Recent years have seen increasing popularity of modeling complex phenomena in applied fields using "probabilistic graphical models", a formalism that exploits the conjoined talents of graph theory and probability theory to build complex models out of simpler pieces. In this talk, I discuss a class
of generalized mean field (GMF) inference algorithms for approximate inference
in exponential family graphical models, and a novel combination of graph
partitioning (GP) algorithms with the GMF algorithms, which potentially leads
to a generic, autonomous, and distributed algorithm for deterministic
approximate inference in arbitrary graphical models. I discuss the mathematical
underpinnings for the GMF algorithms and formal relationships between GMF and
GP. I also discuss applications of the GMF algorithms in both toy examples and
a large-scale computational biology problem in genomic sequence analysis.
(Joint work with Michael
Jordan and Stuart Russell.)
Eric Xing received his B.S. with honors in Physics and Biology from Tsinghua University, his Ph.D. in Molecular Biology and Biochemistry from Rutgers University, and his Ph.D. in Computer Science at UC Berkeley. His early work in molecular biology focused on
the genetic mechanisms of human carcinogenesis and the mutational
spectrum of tumor suppressor genes. Then he moved into machine
learning and has worked on probabilistic graphical models, approximate
inference and pattern recognition. He is interested in studying
biological problems (in particular, systems biology, genetic inference
and evolution) using statistical learning approaches, theory and
application of graphical models, nonparametric Bayesian analysis and
semi-unsupervised learning.
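The simplest member of the mean field family that GMF generalizes is naive mean field, in which every variable gets its own independent factor. For a pairwise Ising/Boltzmann model with ±1 spins, the fixed-point updates take one line each; GMF replaces the single-variable factors with factors over clusters found by graph partitioning. A sketch under those assumptions, not the GMF algorithm itself:

```python
import math

def mean_field_ising(J, h, iters=200):
    """Naive mean-field inference for a pairwise Ising model over +/-1
    spins with couplings J[i][j] and fields h[i]: iterate the
    fixed-point equations m_i <- tanh(h_i + sum_j J[i][j] * m_j),
    where m_i approximates the marginal mean E[s_i]."""
    n = len(h)
    m = [0.0] * n
    for _ in range(iters):
        for i in range(n):
            field = h[i] + sum(J[i][j] * m[j] for j in range(n) if j != i)
            m[i] = math.tanh(field)
    return m
```

Each update is a coordinate ascent step on the variational (mean-field) lower bound on the log-partition function; with positive couplings and positive fields the magnetizations settle near +1, as expected.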
Poker is a challenging problem for AI research: multiple agents (up to 10), a stochastic element (cards being dealt), imperfect information (you do not know the opponents' cards), user modeling (identifying player patterns), and risk management (betting decisions). For over 10 years the University of Alberta has been building poker-playing programs using four designs:
1) knowledge-based systems,
2) simulations,
3) game theory, and
4) tree searching with learning.
The prospects of a program successfully challenging the best human players in the near future are excellent. In this talk we will motivate the research, compare the four different program designs, and discuss our future directions.
Jonathan Schaeffer is a professor in Computing Science at the University of Alberta.
A fundamental issue in computational learning theory, as well as in biological and cognitive information processing, is the best possible relationship between model representation complexity and prediction accuracy. Clearly, we expect more complex models that require longer data representation to be more accurate. Can one provide a quantitative - yet general - formulation of this tradeoff? In this talk I will discuss this question from an information-theoretic perspective. I will argue that this trade-off can be traced back to the basic duality between source and channel coding, and that it is also related to the notion of "coding with side information". Moreover, using some recent results I will argue that when formulated properly (as an "Information Bottleneck") the optimal representations form a "perfectly matched source-channel" and are achievable without delays or block-coding. I will review the theoretical achievability results for such relevant data representations and discuss some of our algorithms for extracting them. I will then demonstrate the application of these ideas to the analysis of natural language corpora and speculate on the possibly-universal scaling between meaningful information and word entropy in human communication that they reveal.
http://www.cs.huji.ac.il/~tishby/
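The Information Bottleneck has a concrete self-consistent iteration: alternately update the cluster prior p(t), the cluster predictions p(y|t), and the soft assignments p(t|x) ∝ p(t)·exp(-β·KL(p(y|x) ‖ p(y|t))). The sketch below implements those standard equations in plain Python; the toy joint distribution and the small smoothing constant are my additions.

```python
import math
import random

def normalize(v):
    s = sum(v)
    return [x / s for x in v]

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def ib(p_xy, n_t, beta, iters=200, seed=0):
    """Self-consistent Information Bottleneck iterations: compress X
    into n_t clusters T while preserving information about Y."""
    rng = random.Random(seed)
    n_x, n_y = len(p_xy), len(p_xy[0])
    p_x = [sum(row) for row in p_xy]
    p_y_given_x = [normalize(row) for row in p_xy]
    # Random soft assignment p(t|x) to break symmetry.
    q = [normalize([rng.random() + 0.01 for _ in range(n_t)])
         for _ in range(n_x)]
    for _ in range(iters):
        p_t = [sum(p_x[i] * q[i][t] for i in range(n_x)) for t in range(n_t)]
        p_y_given_t = [normalize([sum(p_xy[i][y] * q[i][t] for i in range(n_x))
                                  + 1e-12 for y in range(n_y)])
                       for t in range(n_t)]
        q = [normalize([p_t[t] *
                        math.exp(-beta * kl(p_y_given_x[i], p_y_given_t[t]))
                        for t in range(n_t)])
             for i in range(n_x)]
    return q

# Four x-values, two y-values: x0,x1 predict y0; x2,x3 predict y1.
p_xy = [[0.225, 0.025], [0.225, 0.025], [0.025, 0.225], [0.025, 0.225]]
q = ib(p_xy, n_t=2, beta=5.0)
```

Because the update for p(t|x) depends on x only through p(y|x), values of x with identical conditionals receive identical assignments, which is the sense in which the bottleneck keeps only the information relevant to Y.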
Over the last several years, a number of authors have developed graph-theoretic or network models for large-population game theory and economics. In such models, each player or organization is represented by a vertex in a graph, and payoffs and transactions are restricted to obey the topology of the graph. This allows the detailed specification of rich structure (social, technological, organizational, political, regulatory) in strategic and economic systems.
In this talk, I will survey these models and the attendant algorithms for certain basic computations, including Nash, correlated, and Arrow-Debreu equilibria. Connections to related topics, such as Bayesian and Markov networks for probabilistic modeling and inference, will be discussed. Time permitting, I will briefly discuss some recent experimental work marrying this general line of thought with topics in social network theory.
Many web recommendation systems direct users to webpages, from a single website, that other similar users
have visited. By contrast, our WebIC web
recommendation system is designed to locate "information content (IC)
pages" --- pages the current user needs to see to complete her task ---
from essentially anywhere on the web. WebIC first extracts the "browsing properties" of
each word encountered in the user's current click-stream --- e.g., how often each word appears in the title of a page, or
in the "anchor" of a link that was followed, etc. It then uses a user- and site-independent
model, learned from a set of annotated web logs acquired in a user study, to
determine which of these words is likely to appear in an IC page. We discuss how to use these IC-words to find
IC-pages, and demonstrate empirically that this browsing-based approach works
effectively.
Joint work with
Tingshao Zhu, Gerald Häubl
and Bob Price.
After earning a PhD from Stanford, Russ Greiner worked in both academic and industrial research before settling at the University of Alberta.
For more information, see http://www.cs.ualberta.ca/~greiner/WebIC
Dr. Greiner is visiting CALD here at CMU this Spring.
In the early days of AI some researchers proposed that
intelligent problem solving could be reduced to the application of general
purpose theorem provers to an axiomatization
of commonsense knowledge. Although automated first-order theorem proving was
unwieldy, general reasoning engines for propositional logic turned out to be
surprisingly efficient for a wide variety of applications. Yet many problems of interest to AI involve
probabilities or quantification, and would seem to be beyond propositional
methods. However, recent research has shown that the basic backtrack search algorithm for satisfiability generalizes to a strikingly efficient approach for broader classes of inference. We describe our progress in building the world's fastest engine for model counting (the core #P-complete problem) by leveraging the techniques of clause learning and formula caching.
Henry Kautz is an Associate Professor in the Department of Computer Science and Engineering at the University of Washington. He joined the faculty in the summer of 2000 after a career at Bell Labs and AT&T Laboratories, where he was Head of the AI Principles Research Department. His academic degrees include an A.B. in mathematics and a Ph.D. in computer science. He is a recipient of the "Computers and Thought Award" from the International Joint Conference on Artificial Intelligence and a Fellow of the American Association for Artificial Intelligence. In 1998 he was elected to the Executive Council of AAAI, and in 2000 was Program Chair for the AAAI National Conference.
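The core model-counting recursion extends DPLL backtrack search in one small way: when every clause is satisfied, the remaining unassigned variables are free, contributing a factor of 2^k. The sketch below shows just that recursion, without the clause learning and formula caching that make the engine described in the talk fast; DIMACS-style literal encoding (positive/negative integers) is assumed.

```python
def count_models(clauses, n_vars, assignment=None):
    """DPLL-style #SAT: count satisfying assignments of a CNF formula.
    Each clause is a list of nonzero ints; -v means variable v negated."""
    if assignment is None:
        assignment = {}
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(l)) == (l > 0) for l in clause):
            continue                      # clause already satisfied
        rest = [l for l in clause if abs(l) not in assignment]
        if not rest:
            return 0                      # clause falsified: no models here
        simplified.append(rest)
    if not simplified:
        # All clauses satisfied: every unassigned variable is free.
        return 2 ** (n_vars - len(assignment))
    v = abs(simplified[0][0])             # branch on an unassigned variable
    total = 0
    for val in (True, False):
        assignment[v] = val
        total += count_models(clauses, n_vars, assignment)
        del assignment[v]
    return total
```

For instance, (x1 OR x2) over two variables has three models. Clause learning prunes the two branches with conflict-derived clauses, and formula caching reuses counts for residual formulas that recur across branches, which is where the #P-hard work actually gets saved.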
In the last 10 years, the combination of techniques from machine
learning and operations research has allowed major advances in learning
and planning for uncertain environments. Reasonably large problems
can be solved using current techniques. But what if we want to scale
up to the uncertain learning and planning problem that you face every
day? It is many orders of magnitude larger than the biggest problem we
can solve currently.
In this talk, I'll describe three early pieces of work that try to
begin to address working in truly huge environments. The first is a
method for learning probabilistic rules to describe naive physics
models of the interactions between objects. The second is an uncertain
planning algorithm that uses the rules learned by the first method to
construct contingency plans that consider enough cases to perform
robustly, but are much smaller than complete policies. The last piece
is preliminary work on combining multiple abstraction methods
dynamically, in order to allow an agent to have a working model of the
environment that changes focus depending on the current situation.
Our perceptual systems are bewildering in their complexity. Although a
great deal has been learned about their structure and organization,
insights into the underlying information processing algorithms have
proved elusive. Recent theoretical work, however, has begun to shed
light on some of the underlying computational principles and has
provided explanations for both the properties of early perceptual
representations and their relationship to the natural environment. A
motivating hypothesis of these theories is that part of what makes
biological sensory systems so robust and powerful is their ability to
learn cues and features from the statistical regularities of natural
signals. By developing probabilistic models and algorithms for
learning structure from these regularities, we will both be able to
develop more robust artificial systems, and gain insight into the types
of representations and computations used by biological systems. I will
show that a simple set of theoretical principles can be used to learn
optimal codes for both natural images and sounds, and that these
representations can explain a wide range of complex neural properties
and provide new insights into their organization. I will also show
that this framework can be extended in order to learn abstract sensory
properties and invariant features. This framework allows us to make
novel predictions about the functional roles of many aspects of
biological visual and auditory systems.
Joint work with Yan Karklin and Evan Smith.
Dr. Lewicki received his B.S. degree in mathematics and cognitive science from Carnegie Mellon University, his doctoral degree in computation and neural systems from the California Institute of Technology, and did postdoctoral studies in the Computational
Neurobiology Laboratory at the Salk Institute. His research involves
the study and development of computational and theoretical approaches
to the representation, processing, and learning of pattern structure in
natural visual and acoustic environments. He is the recipient of an
NSF CAREER award for his multidisciplinary work on the representation
of natural auditory scenes. He is currently an assistant professor in
the Computer Science Department at Carnegie Mellon University and a member of the CMU-University of Pittsburgh Center for the Neural Basis of Cognition.
Computational Induction of Explanatory Process Models
The growing amount of scientific data has led to the increased use of
computational discovery methods to understand and interpret them. However, most
work has relied on knowledge-lean techniques like clustering and classification
learning, which produce descriptive rather than explanatory
models, and it has utilized formalisms developed in AI or statistics, so that
results seldom make contact with current theories or scientific notations. In
this talk, I present an approach to computational discovery that encodes
explanatory scientific models as sets of quantitative processes, simulates
these models' behavior over time, incorporates background knowledge to
constrain model construction, and induces these models from time-series data in a robust manner. I illustrate this framework
on data and models from Earth science and microbiology, two domains in which explanatory process accounts occur frequently. In
closing, I describe our progress toward an interactive software
environment for the construction, evaluation, and revision of such explanatory
scientific models.
This talk describes joint work with Kevin Arrigo, Nima Asgharbeygi, and others.
http://cll.stanford.edu/~langley/