Abstract
While supervised kernel techniques involving a single latent function (or
discriminant) are well understood, and powerful methods have emerged,
much less is known about models which combine a number of latent
functions. We describe a generic way of generalizing the sparse Bayesian
Gaussian process Informative Vector Machine (IVM) to such multi-process
models, emphasizing the key techniques which are required for an
efficient solution (exploiting matrix structure and numerical quadrature).
We apply our method to the multi-way classification problem, obtaining
a scheme which scales essentially linearly in the number of datapoints
and classes. We show how kernel parameters can be learned by empirical
Bayesian techniques. We argue that a good solution for the multi-class
problem leads to schemes for larger structured graphical models such as
conditional random fields.
Joint work with Michael Jordan.
Pradeep Ravikumar Last modified: Sun Aug 22 14:59:03 EDT 2004