NBAYES

Synopsis

Naive Bayes: using RAINBOW

Description

A brief

The Naive Bayes probabilistic classifier (NaiveBayes) is another commonly used learning approach to text categorization. The basic idea is to use the conditional probabilities of words given a category, together with the category priors, to estimate the probabilities of categories given a document. The naive part of such a model is the assumption that words occur independently of one another given the category. This assumption makes the computation far more efficient than the exponential complexity of non-naive Bayes or DTree approaches, because it does not use word combinations as predictors.
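
The idea above can be sketched in a few lines of Python. This is only an illustration of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing, not the code that rainbow runs. The function names train_naive_bayes and classify, and the choice of add-one smoothing, are assumptions made for this example. Each category is scored as log P(category) plus the sum of log P(word | category) over the words of the document, which is exactly where the independence assumption enters.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """Collect counts from training data.

    docs is a list of (words, category) pairs, where words is a list of
    tokens. Returns (documents per category, word counts per category,
    vocabulary). All names here are hypothetical, chosen for this sketch.
    """
    cat_doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, cat in docs:
        cat_doc_counts[cat] += 1
        word_counts[cat].update(words)
        vocab.update(words)
    return cat_doc_counts, word_counts, vocab


def classify(words, model):
    """Return (best category, log-score per category) for a token list."""
    cat_doc_counts, word_counts, vocab = model
    n_docs = sum(cat_doc_counts.values())
    scores = {}
    for cat, n_cat in cat_doc_counts.items():
        total_words = sum(word_counts[cat].values())
        # Start from the log prior, log P(category).
        score = math.log(n_cat / n_docs)
        for w in words:
            if w not in vocab:
                continue  # words never seen in training are ignored
            # Add-one smoothed estimate of P(word | category).
            p = (word_counts[cat][w] + 1) / (total_words + len(vocab))
            score += math.log(p)
        scores[cat] = score
    return max(scores, key=scores.get), scores
```

Working in log space avoids numeric underflow when a document has many words. Because each word contributes one independent term, classifying a document costs time proportional to the number of words times the number of categories.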

Usage

We use the "rainbow" package for this task. rainbow also supports other methods such as kNN, but we describe only Naive Bayes here.
Please read the help file for more details, or the short version for the argument list.
Scripts are in /afs/cs/academic/class/11741-s98/rainbow/. The latest version is maintained by Andrew McCallum at:

Source: /afs/cs/project/theo-9/webkb/mccallum/src/bow
Linux binaries: /afs/cs/project/theo-9/webkb/mccallum/src/bow-linux
SUNOS binaries: /afs/cs/project/theo-9/webkb/mccallum/src/bow-sunos
Documentation can be found at http://www.cs.cmu.edu/~mccallum/bow and http://www.cs.cmu.edu/~mccallum/bow/rainbow.
A good introduction was written by Andrew and can be found at http://www.cs.cmu.edu/afs/cs/project/theo-11/www/naive-bayes/gentle_intro.html