A long-standing goal of artificial intelligence is to
enable computers to understand human languages. Much progress has
been made toward this goal, but much remains to be done.
Before artificial intelligence systems can meet this goal, they first
need the ability to parse sentences, or transform them into a
representation that is more easily manipulated by computers.
Several knowledge sources are required for parsing, such as a grammar,
lexicon, and parsing mechanism.
Natural language processing (NLP) researchers have traditionally
attempted to build these knowledge sources by hand, often resulting in
brittle, inefficient systems that take a significant effort to
build. Our goal here is to overcome this "knowledge acquisition
bottleneck" by applying methods from machine learning. We develop
and apply methods from empirical or corpus-based NLP to
learn semantic lexicons, and from active learning to reduce the
annotation effort required to learn them.
The semantic lexicon
is one NLP component that is typically challenging and time-consuming to
construct and update by hand. Our notion of semantic lexicon,
formally defined in Section 3, is that
of a list of phrase-meaning pairs, where the meaning representation
is determined by the language understanding task at hand, and where we
take a compositional view of sentence meaning [Partee et al., 1990].
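As a concrete illustration, a semantic lexicon in this sense can be thought of as a mapping from phrases to meaning representations. The sketch below uses hypothetical entries in the style of a geography query domain; the phrases and logical-form fragments are illustrative only, not the paper's actual lexicon or notation:

```python
# A minimal sketch of a semantic lexicon as phrase-meaning pairs.
# The entries are illustrative logical-form fragments for a geography
# query domain; the actual representations depend on the task at hand.
semantic_lexicon = {
    "capital": "capital(_,_)",
    "largest": "largest(_,_)",
    "state": "state(_)",
    "borders": "next_to(_,_)",
}

def lookup(phrase):
    """Return the meaning paired with a phrase, or None if unknown."""
    return semantic_lexicon.get(phrase)

print(lookup("capital"))  # capital(_,_)
```

Under a compositional view, a parser combines such per-phrase meanings to build the representation of a whole sentence.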
This paper describes a system, WOLFIE
(WOrd Learning From Interpreted Examples), that acquires a semantic
lexicon of phrase-meaning pairs from a corpus of sentences paired with
semantic representations. The goal is to automate lexicon
construction for an integrated NLP system that acquires both semantic
lexicons and parsers for natural language interfaces from a single
training set of annotated sentences.
Although many others
[Sébillot et al., 2000; Riloff &amp; Jones, 1999; Siskind, 1996; Hastings, 1996; Grefenstette, 1994; Brent, 1991]
have presented systems for learning information about lexical
semantics, we present here a system for learning
lexicons of phrase-meaning pairs.
Further, our work is unique in its combination of several
features, though prior work has included some of these aspects.
First, its output can be used by a system, CHILL
[Zelle &amp; Mooney, 1996; Zelle, 1995], that learns to parse sentences
into semantic representations. Second, it uses a fairly
straightforward batch, greedy, heuristic learning algorithm that
requires only a small number of examples to generalize well. Third, it is
easily extendible to new representation formalisms. Fourth, it
requires no prior knowledge, although it can exploit an initial lexicon
if provided. Finally, it simplifies the learning problem by making
several assumptions about the training data, as described further in Section 3.2.
We test WOLFIE's ability to acquire a semantic lexicon for a
natural language interface to a geographical database using a corpus of
queries collected from human subjects and annotated with their logical
form. In this test, WOLFIE is integrated with CHILL, which learns
parsers but requires a semantic lexicon (previously built
manually). The results demonstrate that the final acquired parser
performs nearly as accurately at answering novel questions when using
a learned lexicon as when using a hand-built lexicon. WOLFIE is also
compared to an alternative lexicon acquisition system developed by
Siskind (1996), demonstrating superior performance on this
task. Finally, the corpus is translated into Spanish, Japanese, and
Turkish, and experiments are conducted demonstrating an ability to
learn successful lexicons and parsers for a variety of languages.
A second set of experiments demonstrates WOLFIE's ability to scale
to larger and more difficult, albeit artificially generated, corpora.
Overall, the results demonstrate a robust ability to acquire accurate
lexicons directly usable for semantic parsing. With such an
integrated system, the task of building a semantic parser for a new
domain is simplified. A single representative corpus of
sentence-representation pairs allows the acquisition of both a
semantic lexicon and parser that generalizes well to novel sentences.
While building an annotated corpus is arguably less work than building
an entire NLP system, it is still not a simple task. Redundancies and
errors may occur in the training data. A further goal is therefore to
minimize the annotation effort while still achieving a reasonable level
of generalization performance. In the case of natural language, there
is frequently a large amount of unannotated text available. We would
like to automatically, but intelligently, choose which of the
available sentences to annotate.
We do this here using a technique called active learning.
Active learning is a research area in machine learning that
features systems that automatically select the most informative
examples for annotation and training [Cohn et al., 1994]. The primary
goal of active learning is to reduce the number of examples that the
system is trained on, thereby reducing the example annotation cost,
while maintaining the accuracy of the acquired information. To
demonstrate the usefulness of our active learning techniques, we
compared the accuracy of parsers and lexicons learned from examples
chosen by active learning with those learned from randomly chosen
examples, and found that active learning yielded significant savings
in annotation cost in the geography query domain.
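The setup described above can be sketched as a generic pool-based active learning loop. The `informativeness` score below is a hypothetical stand-in for the system's actual selection criterion, assuming only that sentences containing more material unseen by the current model are more informative; the `train` and `annotate` functions are likewise placeholders for lexicon/parser learning and human annotation:

```python
def train(examples):
    """Stand-in for training a lexicon/parser on annotated examples."""
    return set(examples)  # trivial "model": the memorized training set

def informativeness(model, sentence):
    """Hypothetical score: the fraction of words unseen by the model."""
    words = sentence.split()
    unseen = [w for w in words if not any(w in str(e) for e in model)]
    return len(unseen) / len(words)

def active_learning(unlabeled, annotate, rounds=3, batch=2):
    """Pool-based active learning: repeatedly select the most informative
    sentences, have them annotated, and retrain on all labels so far."""
    labeled, model = [], train([])
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        # rank the remaining pool by informativeness under the current model
        pool.sort(key=lambda s: informativeness(model, s), reverse=True)
        chosen, pool = pool[:batch], pool[batch:]
        labeled += [(s, annotate(s)) for s in chosen]
        model = train(labeled)
    return labeled
```

The point of the loop is that annotation effort is spent only on the selected sentences, rather than on the entire pool.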
In summary, this paper provides a new statement of the lexicon
acquisition problem and demonstrates a machine learning technique for
solving this problem. Next, by combining this with previous
research, we show that an entire natural language interface can be
acquired from one training corpus. Further, we demonstrate the
application of active learning techniques to minimize the number of
sentences to annotate as training input for the integrated learning
system.
The remainder of the paper is organized as follows.
Section 2 gives more background information on CHILL
and introduces Siskind's lexicon acquisition system, which we will
compare to WOLFIE in Section 5.
Sections 3 and 4 formally define the
learning problem and describe the WOLFIE algorithm in detail. In
Section 5 we present and discuss experiments evaluating
WOLFIE's performance in learning lexicons in a database query domain
and for an artificial corpus.
Next, Section 6 describes and evaluates our use of active
learning techniques for WOLFIE. Sections 7
and 8 discuss related research and future directions,
respectively. Finally, Section 9 summarizes our
research and results.
Cindi Thompson
2003-01-02