Abstract
For Hidden Markov Models (HMMs) with fully connected transition models, the
three fundamental problems of evaluating the likelihood of an observation
sequence, estimating an optimal state sequence for the observations, and
learning the model parameters, all have quadratic time complexity in the
number of states. We introduce a novel class of non-sparse Markov transition
matrices, called Dense-Mostly-Constant (DMC) transition matrices, that allows
us to derive new algorithms for solving the basic HMM problems in
sub-quadratic time. We describe the DMC HMM model and its algorithms, and
convey some intuition for their use. Empirical results for these
algorithms show dramatic speedups for all three problems. In terms of
accuracy, the DMC model yields strong results and outperforms the
baseline algorithms even in domains known to violate the DMC assumption.
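
To give some intuition for how a mostly-constant transition structure can
break the quadratic barrier, here is a minimal sketch (ours, not the
authors' implementation; the storage layout, the function and variable
names, and the assumption that each transition row is a single constant
outside K special entries are all illustrative) of one forward-recursion
step:

    import numpy as np

    # Assumed (illustrative) DMC layout: row i of the transition matrix
    # equals row_const[i] everywhere except at K entries, whose column
    # indices and values are stored in nc_cols and nc_vals.

    def dmc_forward_step(alpha, row_const, nc_cols, nc_vals, obs_probs):
        """One forward update in O(N*K) time instead of O(N^2).

        alpha     : (N,) forward probabilities at time t
        row_const : (N,) constant filling the bulk of each transition row
        nc_cols   : (N, K) column indices of the non-constant entries
        nc_vals   : (N, K) transition probabilities at those entries
        obs_probs : (N,) emission probabilities of the observation at t+1
        """
        N, K = nc_cols.shape
        # The constant parts of all rows contribute the same amount to
        # every destination state; compute that shared term once, in O(N).
        base = float(np.dot(alpha, row_const))
        new_alpha = np.full(N, base)
        # Correct for the N*K entries that differ from their row constant.
        for i in range(N):
            for k in range(K):
                j = nc_cols[i, k]
                new_alpha[j] += alpha[i] * (nc_vals[i, k] - row_const[i])
        new_alpha *= obs_probs
        return new_alpha / new_alpha.sum()  # rescale for numerical stability

Because the constant portion of every row contributes a single shared
scalar, the per-step cost drops from O(N^2) to O(N + NK), and a full
forward pass over T observations costs O(TNK), which is sub-quadratic in
the number of states whenever K is much smaller than N.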

Abstract
Most statistical approaches to modeling text implicitly assume that
informative words are rare. This assumption is usually appropriate for
topical retrieval and classification tasks; however, in non-topical
classification and soft-clustering problems where classes and latent
variables relate to sentiment or author, informative words can be
frequent. In this paper we present a comprehensive set of statistical
learning tools that treat frequently occurring words in a sensible
manner. We introduce probabilistic models of contagion
for classification and soft-clustering based on the Poisson and
Negative-Binomial distributions, which share with the Multinomial the
desirable properties of simplicity and analytic tractability. We then
introduce the Delta-Square statistic to select features and avoid
over-fitting.
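
Since the abstract contrasts these count models with the Multinomial, a
small sketch may help. The following (an illustration under our own
assumptions; the smoothing scheme and all function names are ours, not
the paper's estimators) classifies documents with independent Poisson
models of per-word counts:

    import numpy as np
    from scipy.special import gammaln

    def fit_poisson_rates(counts, labels, n_classes, smooth=0.01):
        """Per-class Poisson rates: mean count of each word, smoothed.

        counts : (n_docs, vocab) matrix of word counts
        labels : (n_docs,) class ids in {0, ..., n_classes - 1}
        """
        vocab = counts.shape[1]
        lam = np.empty((n_classes, vocab))
        for c in range(n_classes):
            docs = counts[labels == c]
            # The MLE of a Poisson rate is the sample mean; the smoothing
            # constant (our choice) keeps every rate strictly positive.
            lam[c] = (docs.sum(axis=0) + smooth) / (len(docs) + smooth)
        return lam

    def poisson_log_likelihood(x, lam_c):
        """log P(x | class) with each word count x_w ~ Poisson(lam_c[w])."""
        # The gammaln term is constant across classes, but keeping it
        # makes this an exact log-likelihood rather than just a score.
        return float((x * np.log(lam_c) - lam_c - gammaln(x + 1)).sum())

    def classify(x, lam, log_prior):
        """Pick the class maximizing log-prior plus log-likelihood."""
        scores = [lp + poisson_log_likelihood(x, lam_c)
                  for lam_c, lp in zip(lam, log_prior)]
        return int(np.argmax(scores))

Unlike a Multinomial over word proportions, the Poisson model scores
absolute counts, so a frequent word carries evidence in proportion to how
far its count departs from the class rate; a Negative-Binomial variant
would further add a per-word dispersion parameter, relaxing the Poisson's
constraint that the mean and variance be equal.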