Syllabus and (tentative) Course Schedule

 
Date Lecture Topics Readings and useful links
Handouts
Module 1
Intro to Functional Approximation
Mon 1/11 1. Overview and Decision Trees

Lecturer: Eric Xing
Slides (Annotated Slides)
Overview of Machine Learning
  • Why Machine Learning?
  • Designing a learning system
  • Issues in Machine Learning

Decision Trees
  • Representation
  • ID3 learning algorithm
  • Entropy, Information gain
  • Overfitting
Mitchell: Chap 1,3
Decision Tree Learning [Applet]

Wed 1/13 2. Probability Review

Lecturer: Aarti Singh
Slides (Annotated Slides)
Probability basics
  • Kolmogorov Axioms
  • Random variables (discrete, continuous)
  • Independence
  • Bayes rule
  • Joint distribution and inference

Density estimation
  • Maximum Likelihood estimate
  • Maximum A Posteriori estimate
Bishop: Chap 1, 2
Probability for Data Miners, by Andrew Moore.

HW1 out
Mon 1/18 3. Instance-based "Learning"

Lecturer: Eric Xing
Slides (Annotated Slides)
Introduction to Classification Theory:

1. Bayesian Optimal Classifier

2. Nonparametric Methods & Instance-based Learning
  • Bayesian decision rule
  • Bayes error
  • Parzen and nearest neighbor density estimation
  • K-nearest neighbor (kNN) classifier
Case study: classification of text documents
Bishop: Chap 2.5
Fukunaga (Intro to Statistical PR)
Tutorial on another instance of "instance-based" learning: locally weighted regression, by Andrew Moore.

Approximating Linear Separation Function
Wed 1/20 4. Naive Bayes

Lecturer: Tom Mitchell
Slides (Annotated Slides)
Generative classifiers:
  • Naive Bayes classifiers with discrete and continuous (Gaussian) features
Case study: classification of text documents
Naive Bayes classifiers [Applet].
Naive Bayes and Logistic Regression, Mitchell's chapter draft.
Bishop: Chap 4
HW1 Due
HW2 out
Mon 1/25 5. Logistic Regression

Lecturer: Tom Mitchell
Slides (Annotated Slides)
Discriminative classifiers:
  • Logistic regression [Applet]
  • Relationship to Naive Bayes
Case study: comparison of LR and NB on text mining
Naive Bayes and Logistic Regression, Mitchell chapter draft.
Bishop: Chap 4, 5
Mitchell: Chap 4
On Discriminative vs. Generative Classifiers, Ng and Jordan, NIPS, 2001.
Wed 1/27 6. Linear Regression

Lecturer: Aarti Singh
Slides (Annotated Slides)
Discriminative classifiers:
  • Discriminative vs generative classifiers
  • Regression
  • Linear regression and its probabilistic interpretation as MLE
  • Regularized linear regression and MAP
  • Nonlinear regression (Polynomial, Nonlinear basis, Locally-weighted/Kernel Regression, Trees)
Linear regression [Applet].
Bishop: Chap 3
Mitchell: Chap 8.3
Tutorial on regression by Andrew Moore.

Mon 2/1 7. Neural Networks
Lecturer: Tom Mitchell
Slides (Annotated Slides)
Neural networks slides
Recommended reading: Mitchell Ch. 4
Wed 2/3 8. Model Selection
Lecturer: Aarti Singh
Slides (Annotated Slides)
  • Overfitting
  • Bias-Variance Decomposition
  • Model Selection (Cross Validation, SRM, Complexity regularization, Information Criteria)
Bishop: Chap 1, 2
Mitchell: Chap 5, 6
Matlab demo code for understanding overfitting

Model comparison and Occam's Razor, Chapter 28 from David MacKay's book
Model selection and Minimum Description Length principle, Mark Hansen and Bin Yu, J. Amer. Statist. Assoc., vol. 96, 746-774, 2001.
HW2 due
HW3 out
Mon 2/8 Class Canceled: CMU was closed due to the snowstorm.
Wed 2/10 Class Canceled: CMU was closed due to the snowstorm.
Clustering
Mon 2/15 9. K-means and Hierarchical Clustering
Lecturer: Aarti Singh
Slides (Annotated Slides)
Introduction to Unsupervised Learning
Clustering
Bishop: Chap 9
Wed 2/17 10. Probabilistic Models for Clustering
Lecturer: Aarti Singh
Slides
Mixture model
The Theory of Expectation-Maximization [Applet: Mixture of Gaussians]
Bishop: Chap 9
Introduction to Graphical Models
Mon 2/22 11. HMM and Bayesian Network I
Lecturer: Eric Xing
Slides (Annotated Slides)
Bayesian Network I: Representation and Inference
  • HMM representation
  • Evaluating marginal probabilities: Forward Algorithm
  • Inference:
    • Forward-backward Algorithm
    • Viterbi Decoding
Bishop: Chap 8
Kevin Murphy's tutorial
BayesNet Toolbox in Matlab by Kevin Murphy
HW3 Due
Wed 2/24 12. Bayesian Network II (HMM) and CRF
Lecturer: Eric Xing
Slides (Annotated Slides)
HMM
  • Viterbi continued.
  • Learning: Baum-Welch algorithm
Conditional Random Field (CRF):
  • Representation
  • Inference and learning.
Same as Lecture 17
Project Proposal Due
Mon 3/1 13. Bayesian Network III: Representation and Learning
Lecturer: Eric Xing
Slides (Annotated Slides)
  • Bayesian network semantics.
  • Conditional independence and D-Separation
  • Parameter learning for fully observed BN.
Bishop: Chap 8
Wed 3/3 Midterm Exam (open book, open notes, no computers)
Mon 3/8 Spring Break
Wed 3/10 Spring Break
Module 2: TBA
Mon 3/15 14. Bayesian Networks IV: Exact Inference
Lecturer: Eric Xing
Slides (Annotated Slides)
Learning: fully observed models.
Inference:
  • Variable Elimination
  • Junction trees and message passing
Bishop: Chap 13
Wed 3/17 15. Learning Theory I
Lecturer: Tom Mitchell
Annotated Slides
Computational Learning Theory
  • Probably approximately correct (PAC) learning
  • Sample complexity
  • VC-dimension
Mitchell: Chap 7
Mon 3/22 16. Learning Theory II
Lecturer: Tom Mitchell
Slides (Annotated Slides)
Computational Learning Theory II
  • Agnostic learning
  • Mistake bounds
  • Weighted Majority algorithm
Mitchell: Chap 7
Wed 3/24 17. Support Vector Machines I
Lecturer: Eric Xing
Slides (Annotated Slides)
  • Max-margin classification and SVM
  • Lagrangian Duality and KKT conditions
  • Solving optimal-margin classifiers
  • The non-separable case: soft-margin and slack variables
  • SMO: sequential minimal optimization
Mon 3/29 18. Support Vector Machines II
Lecturer: Eric Xing
Slides (Annotated Slides)
  • Kernel methods
  • Maximum-entropy Discrimination
  • Structured SVM
Wed 3/31 19. Boosting
Lecturer: Aarti Singh
Slides
  • Combining weak classifiers
  • AdaBoost
  • Comparison with logistic regression
Project Progress Report Due
Mon 4/5 20. Dimensionality Reduction
Lecturer: Aarti Singh
Slides
  • Feature Selection
  • Identifying latent features
  • Linear Methods - PCA
  • Nonlinear Methods - ISOMAP
Wed 4/7 21. Spectral Clustering
Lecturer: Aarti Singh
Slides
  • Graph-Theoretic Methods for Clustering
  • Graph Laplacian
  • Balanced min-cut
  • Spectral clustering
HW4 due
HW5 out
Mon 4/12 22. Structure Learning I
Lecturer: Eric Xing
Slides (Annotated Slides)
  • Graphical Gaussian Model
  • Neighborhood selection
    • Graphical lasso
    • Sparsistency
  • Time-varying GGM
Wed 4/14 23. Structure Learning II: Bayesian Structure Learning
Guest Lecturer: Zoubin Ghahramani
Slides (Annotated Slides)
  • Parameter learning in directed models:
    • complete and incomplete data
    • ML and Bayesian methods
  • Bayesian model comparison and Occam’s Razor
  • Structure learning in directed models: complete and incomplete data
  • Causality
  • Parameter and Structure learning in undirected models
Mon 4/19 24. Semi-Supervised Learning
Lecturer: Aarti Singh
Slides
  • Semi-Supervised Learning
  • Learning from labeled and unlabeled data
  • Generative Mixture model approach
  • Co-training and Multi-view methods
  • Graph regularization
Wed 4/21 25. Active Learning
Lecturer: Aarti Singh
Slides
  • Active learning
  • Feedback-driven sequential learning
  • Binary bisection search
  • Query-by-Committee
  • Density weighting, Estimated Error Reduction
Active learning literature survey
HW5 due
Mon 4/26 26. Reinforcement Learning I
Lecturer: Tom Mitchell
Slides
Reinforcement Learning
  • Markov Decision Processes
  • Q learning
Reinforcement Learning: A Survey, Kaelbling et al., JAIR, 1996.
Wed 4/28 27. Reinforcement Learning II
Lecturer: Tom Mitchell
Slides 
Reinforcement Learning
  • Models of human learning
Quick course summary
Tuesday May 4th Poster Session NSH Atrium, 3:00pm-6:00pm
Friday May 7th Final Exam DH 2302, 5:30pm-8:30pm
 

© 2008 Eric Xing @ School of Computer Science, Carnegie Mellon University