Machine Learning and Friends at Carnegie Mellon

Abstract

Given an unlabeled, imbalanced data set, the goal of rare category detection is to discover examples from the minority classes using only a few label requests. Rare category detection is an open challenge in machine learning with many applications, including financial fraud detection, network intrusion detection, astronomy, and spam image detection. In this talk, I will introduce two methods for rare category detection with spatial data. The first essentially performs local density differential sampling and requires prior information about the data set as input. The second is based on specially designed exponential families and is prior-free. Experimental results demonstrate the effectiveness of these methods on different real data sets.
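To give a rough feel for what "local density differential sampling" could mean, here is a minimal sketch. It is not the speaker's algorithm: the function names, the fixed `radius`, the `scale` factor, and the toy data are all assumptions made for illustration. The idea it shows is simply to estimate a local density at every point and to request a label first where the density changes most sharply, since a compact minority class tends to sit as a dense spot inside a sparser majority.

```python
import math

def local_counts(points, radius):
    """Crude local density: number of other points within `radius` of each point."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and math.dist(p, q) <= radius)
        for i, p in enumerate(points)
    ]

def density_differential_scores(points, radius, scale=2.0):
    """Score each point by how much denser it is than its sparsest
    neighbor within `scale * radius` (hypothetical scoring rule)."""
    counts = local_counts(points, radius)
    scores = []
    for i, p in enumerate(points):
        nearby = [
            counts[j]
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= scale * radius
        ]
        scores.append(counts[i] - min(nearby) if nearby else 0)
    return scores

# Toy data (assumed): a regular 10 x 10 grid as the majority class,
# plus a tight 5-point cluster as the rare class.
majority = [(float(x), float(y)) for x in range(10) for y in range(10)]
offsets = [(0.0, 0.0), (0.05, 0.0), (-0.05, 0.0), (0.0, 0.05), (0.0, -0.05)]
minority = [(4.5 + dx, 4.5 + dy) for dx, dy in offsets]
points = majority + minority

scores = density_differential_scores(points, radius=0.5)

# The point with the largest density differential is the first label request.
query_index = max(range(len(points)), key=scores.__getitem__)
print(query_index, points[query_index], scores[query_index])
```

In this toy setup the hand-picked `radius` stands in for the prior information that the first method is said to require; how that prior is actually used in the talk's method is not specified here.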

Venue, Date, and Time

Venue: Newell Simon Hall 1507

Date: Monday, Nov 17, 2008

Time: 12:00 noon