CMU Artificial Intelligence Repository
Brill: Trainable Part of Speech Tagger
areas/nlp/parsing/taggers/brill/
This directory contains Eric Brill's trainable rule-based part of
speech tagger. This tagger is based on transformation-based
error-driven learning, a technique that has been effective in a number
of natural language applications, including part of speech and word
sense tagging, prepositional phrase attachment, and syntactic parsing.
The code includes a tokenizer for ASCII English, an English lexicon
induced from the Brown corpus, a table of mappings for word suffixes
to likely ambiguity classes, and an HMM trained on the odd-numbered
sentences in the Brown corpus.
For more information, see chapter 6 of Brill's thesis.
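As an illustration of the transformation-based error-driven learning
mentioned above, here is a minimal sketch in Python. It is not the code
distributed in this directory. It assumes a single simplified rule
template ("change tag A to tag B when the previous tag is C"), a toy
lexicon, and hypothetical function names (initial_tags, apply_rule,
learn_rules, tag) chosen only for this example. Each round picks the
rule that fixes the most errors net of the errors it would introduce,
applies it, and repeats until no rule helps.

```python
from collections import Counter

def initial_tags(words, lexicon, default="NN"):
    """Initial state: give each word its most likely tag from a lexicon.
    The default tag for unknown words is an assumption of this sketch."""
    return [lexicon.get(w, default) for w in words]

def apply_rule(tags, rule):
    """Apply one rule (from_tag, to_tag, prev_tag) to a tag sequence.
    Contexts are read from the input sequence, not the partly updated one."""
    from_tag, to_tag, prev_tag = rule
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out

def learn_rules(words, gold, lexicon, max_rules=10):
    """Greedily learn an ordered list of rules from a hand-tagged sample."""
    current = initial_tags(words, lexicon)
    rules = []
    for _ in range(max_rules):
        score = Counter()
        # Propose a rule at every error; count how many errors it fixes.
        for i in range(1, len(current)):
            if current[i] != gold[i]:
                score[(current[i], gold[i], current[i - 1])] += 1
        # Penalize each proposed rule for every correct tag it would break.
        for i in range(1, len(current)):
            if current[i] == gold[i]:
                for rule in list(score):
                    if rule[0] == current[i] and rule[2] == current[i - 1]:
                        score[rule] -= 1
        if not score:
            break
        best, best_score = max(score.items(), key=lambda kv: kv[1])
        if best_score <= 0:
            break
        rules.append(best)
        current = apply_rule(current, best)
    return rules

def tag(words, lexicon, rules):
    """Tag new text: initial tags first, then the learned rules in order."""
    tags = initial_tags(words, lexicon)
    for rule in rules:
        tags = apply_rule(tags, rule)
    return tags
```

For example, given a lexicon in which "can" is most often a modal,
training on a sample where "can" follows a determiner and is tagged as
a noun would yield a rule that retags a modal as a noun after a
determiner. The sketch treats the training text as one sequence and
ignores sentence boundaries and unknown-word handling.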
Origin:
ftp.cs.jhu.edu:/pub/brill/Programs/
ftp.cs.jhu.edu:/pub/brill/Papers/
Version: 1.13 (21-JUN-94)
Requires: Common Lisp
Copying: Copyright (c) 1993 by MIT
Use, copying, modification, and distribution permitted.
CD-ROM: Prime Time Freeware for AI, Issue 1-1
Mailing List: If you wish to be on the mailing list for future
releases, bug reports, etc., please send mail to the author.
Author(s): Eric Brill
Keywords:
Authors!Brill, Error-Driven Learning, HMM, Lisp!Code,
Machine Learning, NLP, Parsing, Part of Speech Taggers,
Taggers
References: ?
Last Web update on Mon Feb 13 10:27:02 1995
AI.Repository@cs.cmu.edu