Abstract
The nearest neighbor rule for classification is useful whenever it is
difficult to model each class with generative approaches. The
classification performance of the NN rule is critically dependent on the
distance measure used for finding the nearest neighbor. A distance
measure that minimizes the mis-classification risk for the 1-nearest
neighbor search can be shown to be the probability that a pair of
input measurements belong to different classes, a result that is also
known in the literature on Canonical Distortion Measures where a
different criterion is minimized. This pairwise probability is not in
general a metric distance measure, although somewhat surprisingly it
does satisfy the triangle inequality. Furthermore, it can out-perform
any metric distance, approaching even the Bayes optimal performance.
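In symbols (notation ours, assuming the two labels are conditionally independent given their inputs), this optimal pairwise distance can be written in terms of the class posteriors as

```latex
d^{*}(x, x') \;=\; P\left(y \neq y' \mid x, x'\right)
           \;=\; 1 - \sum_{c} P(c \mid x)\, P(c \mid x'),
```

so the 1-nearest neighbor under d^{*} is the training point most likely to share the query's class.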
Although in theory we can express the optimal distance measure in terms of generative models, in practice this approach is limited by our ability to estimate generative models from data. Instead, we use a linear logistic model for the optimal distance measure that combines the discriminative powers of more elementary distance measures associated with a collection of simple feature spaces that are easy and efficient to implement. To perform efficient nearest neighbor search over large training sets, the linear model was extended to discretized distance measures that combine the distance measures associated with a collection of discriminators. The discrete model was then combined with the continuous model to yield a hierarchical distance model that is both fast and accurate. We also discuss connections to boosting and the maximum entropy approach.
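As a concrete illustration of the linear logistic construction, the following is a minimal sketch (toy data, invented elementary distances, and scikit-learn's LogisticRegression; not the paper's implementation): the pairwise model is trained to predict the different-class event and then used as the distance for 1-NN search.

```python
# Sketch: a linear logistic pairwise distance
#   d(x, x') = sigmoid(w . [d_1(x,x'), ..., d_K(x,x')] + b),
# trained to predict whether a pair of points has different labels.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy data: two Gaussian classes in 2-D (placeholder for real features).
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

def elementary_distances(a, b):
    """K simple per-pair distance features (hypothetical choices):
    per-coordinate absolute differences plus the Euclidean distance."""
    return np.array([abs(a[0] - b[0]), abs(a[1] - b[1]), np.linalg.norm(a - b)])

# Training set of random pairs: features are the elementary distances,
# the target is 1 exactly when the pair's labels differ.
idx = rng.integers(0, len(X), size=(5000, 2))
D = np.array([elementary_distances(X[i], X[j]) for i, j in idx])
t = (y[idx[:, 0]] != y[idx[:, 1]]).astype(int)

model = LogisticRegression().fit(D, t)

def learned_distance(a, b):
    """Estimated probability that the two points belong to different classes."""
    return model.predict_proba(elementary_distances(a, b).reshape(1, -1))[0, 1]

# 1-NN classification of a query point under the learned distance.
query = np.array([1.0, 0.5])
nn = min(range(len(X)), key=lambda i: learned_distance(query, X[i]))
print("predicted class:", y[nn])
```

Because the model is linear in the elementary distances, each feature space contributes a single learned weight, so weak but cheap distance measures can be pooled without estimating a full generative model.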