Abstract
We will use link grammar as our grammatical base. Because the link grammar formalism is highly lexical in nature, it will allow us to integrate more traditional modeling schemes with grammatical ones. An efficient, robust link grammar parser will assist in this undertaking.
We will initially build finite-state language models that utilize relatively simple grammatical information, such as part-of-speech data, along with information sources used by other language models. Our models feature a new framework for probabilistic automata that makes use of hidden data to construct context-sensitive probabilities. The maximum entropy principle employed by these {\em Gibbs-Markov models} facilitates the integration of multiple information sources.
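To make the form of these models concrete, the maximum entropy principle leads to conditional distributions of the standard Gibbs form; the feature functions shown here are illustrative, not a complete specification of the Gibbs-Markov framework:
\[
p(w \mid h) \;=\; \frac{1}{Z(h)} \exp\Big( \sum_i \lambda_i f_i(h, w) \Big),
\qquad
Z(h) \;=\; \sum_{w'} \exp\Big( \sum_i \lambda_i f_i(h, w') \Big),
\]
where $h$ is the conditioning history (including any hidden data), each feature function $f_i$ encodes one information source, such as an $n$-gram or part-of-speech indicator, and the weights $\lambda_i$ are fit so that the model's feature expectations match their empirical counterparts. Because the features combine additively in the exponent, heterogeneous information sources can be incorporated within a single model.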
We will also build language models that take greater advantage of link grammar by including more sophisticated grammatical considerations. These models will include both probabilistic automata and models more closely related to the link grammar formalism.
The expected contributions of this work are to demonstrate that grammatical information can be used to construct language models with low perplexity, and that such models can reduce the error rates of speech recognition systems.
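Here perplexity refers to the standard per-word measure on held-out text,
\[
\mathrm{PP} \;=\; p(w_1, \ldots, w_N)^{-1/N} \;=\; 2^{-\frac{1}{N} \sum_{n=1}^{N} \log_2 p(w_n \mid w_1, \ldots, w_{n-1})},
\]
so lower perplexity indicates that the model assigns higher probability to unseen text.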