Abstract
Data sets with many discrete variables and relatively few cases arise in
health care, e-commerce, information security, text mining, and many
other domains. Learning effective and efficient prediction models from
such data sets is a challenging task. In this paper, we propose a Tabu
Search enhanced Markov Blanket (TS/MB) procedure to learn a graphical
Markov Blanket classifier from data. The TS/MB procedure is based on the
use of restricted neighborhoods in a general Bayesian Network
constrained by the Markov condition, called Markov Blanket
Neighborhoods. Computational results on real-world data sets drawn
from several domains indicate that the TS/MB procedure finds a
parsimonious model with substantially fewer predictor variables than
the full data set and provides prediction performance comparable to or
better than that of several machine learning methods.
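
For readers unfamiliar with the term, the Markov blanket of a target
variable in a Bayesian network is the set of its parents, its children,
and the other parents of its children. The sketch below illustrates only
this standard definition on a hypothetical toy network; it is not the
TS/MB procedure itself, and the function name, node names, and graph are
assumptions made for illustration.

```python
from typing import Dict, Set


def markov_blanket(parents: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return the Markov blanket of `target` in a DAG.

    `parents` maps every node to the set of its parent nodes.
    The blanket is: parents of target, children of target,
    and the other parents of those children.
    """
    # Children are the nodes that list the target among their parents.
    children = {node for node, pa in parents.items() if target in pa}

    # Collect every parent of every child (this includes the target itself).
    co_parents: Set[str] = set()
    for child in children:
        co_parents |= parents[child]

    blanket = set(parents.get(target, set())) | children | co_parents
    blanket.discard(target)  # the target is not part of its own blanket
    return blanket


# Hypothetical toy network: A -> Y -> C <- B, C -> D, and E is isolated.
toy_dag = {
    "A": set(),
    "B": set(),
    "Y": {"A"},
    "C": {"Y", "B"},
    "D": {"C"},
    "E": set(),
}

blanket = markov_blanket(toy_dag, "Y")
print(sorted(blanket))
```

In this toy network, D and E fall outside the blanket of Y, which is the
sense in which a Markov blanket classifier can use far fewer predictor
variables than the full data set contains.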
Pradeep Ravikumar Last modified: Mon Jan 10 12:34:34 EST 2005