Next: Background Up: Acquiring Word-Meaning Mappings for Previous: Acquiring Word-Meaning Mappings for

Introduction and Overview

A long-standing goal for the field of artificial intelligence is to enable computer understanding of human languages. Much progress has been made toward this goal, but much also remains to be done. Before artificial intelligence systems can meet it, they first need the ability to parse sentences, that is, to transform them into a representation that is more easily manipulated by computers. Several knowledge sources are required for parsing, such as a grammar, a lexicon, and a parsing mechanism. Natural language processing (NLP) researchers have traditionally attempted to build these knowledge sources by hand, often resulting in brittle, inefficient systems that take significant effort to build. Our goal here is to overcome this "knowledge acquisition bottleneck" by applying methods from machine learning. We develop and apply methods from empirical or corpus-based NLP to learn semantic lexicons, and from active learning to reduce the annotation effort required to learn them. The semantic lexicon is one NLP component that is typically challenging and time-consuming to construct and update by hand. Our notion of a semantic lexicon, formally defined in Section 3, is a list of phrase-meaning pairs, where the meaning representation is determined by the language understanding task at hand, and where we take a compositional view of sentence meaning [Partee et al., 1990]. This paper describes a system, WOLFIE (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon of phrase-meaning pairs from a corpus of sentences paired with semantic representations. The goal is to automate lexicon construction for an integrated NLP system that acquires both semantic lexicons and parsers for natural language interfaces from a single training set of annotated sentences. 
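To make the notion of a lexicon of phrase-meaning pairs concrete, the following is a minimal sketch (not WOLFIE's actual data structures): phrases are paired with fragments of a logical meaning representation, here hypothetical fragments in the style of a geography query domain.

```python
# Illustrative sketch of a semantic lexicon as a list of phrase-meaning
# pairs. The phrases and logical-form fragments are hypothetical examples
# in the style of a geography database query domain; the real meaning
# representation is determined by the language understanding task.
lexicon = [
    ("what is", "answer(_)"),
    ("capital", "capital(_,_)"),
    ("texas",   "const(stateid(texas))"),
]

def lookup(phrase):
    """Return the meaning fragments paired with a phrase, if any.
    A compositional parser would combine such fragments to build the
    meaning of a whole sentence."""
    return [meaning for p, meaning in lexicon if p == phrase]

print(lookup("texas"))  # ['const(stateid(texas))']
```

Under the compositional view, the meaning of a sentence such as "what is the capital of texas" is assembled from the fragments its phrases contribute; ambiguous phrases would simply be paired with more than one meaning.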
Although many others [Sébillot et al., 2000; Riloff &amp; Jones, 1999; Siskind, 1996; Hastings, 1996; Grefenstette, 1994; Brent, 1991] have presented systems for learning information about lexical semantics, we present here a system for learning lexicons of phrase-meaning pairs. Further, our work is unique in its combination of several features, although prior work has included some of these aspects individually. First, its output can be used by a system, CHILL [Zelle &amp; Mooney, 1996; Zelle, 1995], that learns to parse sentences into semantic representations. Second, it uses a fairly straightforward batch, greedy, heuristic learning algorithm that requires only a small number of examples to generalize well. Third, it is easily extensible to new representation formalisms. Fourth, it requires no prior knowledge, although it can exploit an initial lexicon if provided. Finally, it simplifies the learning problem by making several assumptions about the training data, as described further in Section 3.2. We test WOLFIE's ability to acquire a semantic lexicon for a natural language interface to a geographical database, using a corpus of queries collected from human subjects and annotated with their logical form. In this test, WOLFIE is integrated with CHILL, which learns parsers but requires a semantic lexicon (previously built manually). The results demonstrate that the final acquired parser answers novel questions nearly as accurately with a learned lexicon as with a hand-built one. WOLFIE is also compared to an alternative lexicon acquisition system developed by Siskind (1996), and demonstrates superior performance on this task. Finally, the corpus is translated into Spanish, Japanese, and Turkish, and experiments demonstrate the ability to learn successful lexicons and parsers for a variety of languages. A second set of experiments demonstrates WOLFIE's ability to scale to larger and more difficult, albeit artificially generated, corpora. 
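The flavor of a batch, greedy, heuristic learner of the kind mentioned above can be sketched as a covering loop (this is a generic sketch, not the actual WOLFIE algorithm, whose candidate generation and scoring heuristic are defined in Section 4): from a pool of candidate phrase-meaning pairs, repeatedly select the one whose addition explains the most still-unexplained training examples.

```python
# Hedged sketch of a batch, greedy covering learner. `candidates` maps a
# hypothetical (phrase, meaning) pair to the set of training-example ids
# it could help explain; the loop greedily picks the highest-scoring
# candidate until all examples are covered or no candidate helps.
def greedy_select(candidates, examples):
    chosen, uncovered = [], set(examples)
    while uncovered:
        # Heuristic score: number of still-uncovered examples explained.
        best = max(candidates, key=lambda c: len(candidates[c] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:       # no candidate explains anything new; stop
            break
        chosen.append(best)
        uncovered -= gain
    return chosen

# Hypothetical candidate pairs and the example ids they could explain.
cands = {
    ("capital", "capital(_,_)"): {1, 2},
    ("texas", "const(stateid(texas))"): {2, 3},
    ("river", "river(_)"): {3},
}
print(greedy_select(cands, {1, 2, 3}))
```

Greedy covering of this kind is attractive for lexicon learning because each pass is cheap and the learned lexicon stays small: a candidate is only kept if it explains training examples that nothing already chosen accounts for.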
Overall, the results demonstrate a robust ability to acquire accurate lexicons directly usable for semantic parsing. With such an integrated system, the task of building a semantic parser for a new domain is simplified: a single representative corpus of sentence-representation pairs allows the acquisition of both a semantic lexicon and a parser that generalizes well to novel sentences. While building an annotated corpus is arguably less work than building an entire NLP system, it is still not a simple task, and redundancies and errors may occur in the training data. A further goal, then, is to minimize the annotation effort while still achieving a reasonable level of generalization performance. In the case of natural language, a large amount of unannotated text is frequently available, and we would like to automatically, but intelligently, choose which of the available sentences to annotate. We do this here using active learning, a research area in machine learning that studies systems that automatically select the most informative examples for annotation and training [Cohn et al., 1994]. The primary goal of active learning is to reduce the number of examples on which a system is trained, thereby reducing the example annotation cost, while maintaining the accuracy of the acquired information. To demonstrate the usefulness of our active learning techniques, we compared the accuracy of parsers and lexicons learned from examples chosen by active learning for lexicon acquisition to those learned from randomly chosen examples, and found that active learning saved significant annotation cost; this savings is demonstrated in the geography query domain. In summary, this paper provides a new statement of the lexicon acquisition problem and demonstrates a machine learning technique for solving it. 
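The selection step at the heart of active learning can be sketched as follows. This is a minimal illustration of pool-based selection by uncertainty, one common strategy; the specific informativeness criterion used for lexicon acquisition is described in Section 6, and the sentences and confidence scores below are hypothetical.

```python
# Minimal sketch of pool-based active learning: from a pool of
# unannotated sentences, pick the k the current learner is least
# confident about and request their annotations.
def select_for_annotation(pool, confidence, k=1):
    """pool: unannotated sentences; confidence: maps a sentence to the
    learner's confidence in its current analysis (higher = more sure).
    Returns the k least-confident sentences."""
    return sorted(pool, key=confidence)[:k]

# Hypothetical confidence scores for three unlabeled geography queries.
scores = {
    "what is the capital of texas": 0.9,
    "name the rivers crossing states bordering utah": 0.2,
    "how big is alaska": 0.6,
}
picked = select_for_annotation(list(scores), scores.get)
print(picked)  # the least-confident query is selected for annotation
```

Each round, the selected sentences are annotated, added to the training set, and the lexicon and parser are re-learned, so annotation effort is spent on the examples expected to be most informative.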
By combining this technique with previous research, we then show that an entire natural language interface can be acquired from one training corpus. Further, we demonstrate the application of active learning techniques to minimize the number of sentences that must be annotated as training input for the integrated learning system. The remainder of the paper is organized as follows. Section 2 gives more background on CHILL and introduces Siskind's lexicon acquisition system, to which we compare WOLFIE in Section 5. Sections 3 and 4 formally define the learning problem and describe the WOLFIE algorithm in detail. In Section 5 we present and discuss experiments evaluating WOLFIE's performance in learning lexicons in a database query domain and on an artificial corpus. Next, Section 6 describes and evaluates our use of active learning techniques for WOLFIE. Sections 7 and 8 discuss related research and future directions, respectively. Finally, Section 9 summarizes our research and results.
Cindi Thompson
2003-01-02