Inducing Features of Random Fields
Abstract
We present a technique for constructing random fields
from a set of training samples. The learning paradigm builds
increasingly complex fields by allowing potential functions, or
features, that are supported by increasingly large subgraphs. Each
feature has a weight that is trained by minimizing the
Kullback-Leibler divergence between the model and the empirical
distribution of the training data. A greedy algorithm determines how
features are incrementally added to the field, and an iterative scaling
algorithm is used to estimate the optimal values of the weights.
The random field models and techniques introduced in this paper differ
from those common to much of the computer vision literature in that
the underlying random fields are non-Markovian and have a large number
of parameters that must be estimated. Relations to other learning
approaches including decision trees and Boltzmann machines are given.
As a demonstration of the method, we describe its application to the
problem of automatic word classification in natural language
processing.
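
As a minimal sketch of the training criterion mentioned above, assuming the exponential (Gibbs) form of the induced field with features f_i, weights lambda_i, a reference distribution q_0, and the empirical distribution of the training samples denoted by p-tilde (the precise definitions and notation are given in the body of the paper; the symbols here are only illustrative), the weight-estimation problem can be written as:

\[
p_{\lambda}(\omega) \;=\; \frac{1}{Z_{\lambda}}\,\exp\!\Big(\sum_i \lambda_i f_i(\omega)\Big)\, q_0(\omega),
\qquad
Z_{\lambda} \;=\; \sum_{\omega'} \exp\!\Big(\sum_i \lambda_i f_i(\omega')\Big)\, q_0(\omega'),
\]
\[
\lambda^{\star} \;=\; \arg\min_{\lambda}\; D(\tilde{p}\,\|\,p_{\lambda})
\;=\; \arg\min_{\lambda}\; \sum_{\omega} \tilde{p}(\omega)\,\log\frac{\tilde{p}(\omega)}{p_{\lambda}(\omega)}.
\]

Minimizing this Kullback-Leibler divergence over the weights is equivalent to maximum-likelihood estimation, which is the quantity the iterative scaling updates are designed to optimize; the greedy induction step then ranks candidate features, roughly, by how much adding each one (with its best single weight) can further reduce this divergence.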