A New Look at Basic Classifiers

Sam Roweis

Department of Computer Science

University of Toronto

http://www.cs.toronto.edu/~roweis

Abstract:

In the published literature, Naive Bayes, Logistic Regression and K-Nearest Neighbours are often used as the "punching bags" of classification algorithms, playing the role of straw men against which our latest and greatest innovations are favourably compared. In practical applications, however, these classifiers are extremely simple to implement, are robust, can be quite fast to train and test, and often perform surprisingly well. Rather than be distressed that such stupid algorithms are so frustratingly competitive, we can instead embrace these approaches, take them seriously, and ask how we can make them work even better. In this talk, I'll discuss three simple ideas: one for improving the performance of Naive Bayes by removing redundant features, one for extending logistic regression to include a model of the input density, and one for learning, from data, a distance metric for use in KNN classifiers.

Speaker Bio:

Sam Roweis is an Assistant Professor in the Department of Computer Science at the University of Toronto. His research interests are in machine learning, data mining, and statistical signal processing. Roweis completed his undergraduate degree in the Engineering Science program at the University of Toronto and earned his doctoral degree in 1999 from the California Institute of Technology, working with John Hopfield. He did a postdoc with Geoff Hinton and Zoubin Ghahramani at the Gatsby Unit in London, and was a visiting faculty member at MIT in 2005. He has also worked at several industrial research labs, including Bell Labs, Whizbang! Labs and Microsoft. He holds a Canada Research Chair in Statistical Machine Learning and a Sloan Research Fellowship, and is the winner of a Premier's Research Excellence Award.