10-701 Machine Learning Fall 2007

Basics

What is learning?
- Version spaces
- Sample complexity
- Training set/Test set split
Point estimation
- Loss functions
- MLE
- Bayesian
- MAP
- Bias-Variance tradeoff

Mon., Sep. 10:

Lecture: What's ML, Point estimation [Slides] [Annotated]
Mathematica Demonstration. The Mathematica demonstrations require the newest version of Mathematica (Version 6), which can be obtained from MyAndrew.
Additional Reference: Andrew Moore's basic probability tutorial
Readings: Bishop 2.1, Appendix B

Linear Models

Linear regression [Applet]
http://www.mste.uiuc.edu/users/exner/java.f/leastsquares/
Bias-Variance tradeoff
Overfitting
Bayes optimal classifier
Naive Bayes [Applet]
http://www.cs.technion.ac.il/~rani/LocBoost/
Logistic regression [Applet]
Discriminative v. Generative models [Applet]

Wed., Sep. 12:

Lecture: Gaussians, Linear Regression, Bias-Variance Tradeoff, Overfitting, What's ML revisited. [Slides] [Annotated]
Readings: Bishop 1.1 to 1.4, Bishop 3.1, 3.1.1, 3.1.4, 3.1.5, 3.2, 3.3, 3.3.1, 3.3.2
Completely Optional: Joey's quickly written notes on the matrix MLE for regression. [PDF] [Mathematica6 Notebook] If there are any typos or mistakes, please let me know.

Mon., Sep. 17:

Lecture (Eric Xing): Naive Bayes, Gaussian Naive Bayes [Slides] [Annotated]
Readings: Bishop 1.3, 1.5, 3.2, Mitchell's Chapter on Naive Bayes and Logistic Regression (Sections 1 and 2)

Wed., Sep. 19:

Lecture: Overfitting, What's learning revisited, Generative v. Discriminative, Logistic Regression [Slides] [Annotated]
Required Reading: Mitchell's Chapter on Naive Bayes and Logistic Regression (All sections)
Optional Reading: Ng and Jordan's NIPS 2001 paper on Discriminative versus Generative Learning [pdf] [ps]

Mon., Sep. 24:

Lecture: Logistic Regression [Slides] [Annotated]
Readings: Bishop - 4.0, 4.2, 4.3, 4.4, 4.5

Non-linear models and Model selection (4 Lectures)

Decision trees [Applet]
Overfitting, again
Regularization
MDL
Cross-validation
Boosting [Adaboost Applet] from www.cse.ucsd.edu/~yfreund/adaboost
Instance-based learning [Applet] from www.site.uottawa.ca/~gcaron/applets.htm
- K-nearest neighbors
- Kernels
Neural nets [CMU Course] from www.cs.cmu.edu/afs/cs/academic/class/15782-s04/ [Applet] from http://neuron.eng.wayne.edu/bpFunctionApprox/bpFunctionApprox.html

Wed., Sep. 26:

Lecture: Decision Trees [Slides] [Annotated]
Readings: (Bishop - 1.6) Information Theory
(Bishop - 14.4) Tree-based Models
Recommended Reading: Quantities of Information Wikipedia entry
Recommended Reading: Nils Nilsson's Chapter (All Sections): Decision Trees
Optional Review of Boolean Logic/DNF: Nils Nilsson's Chapter Boolean Functions (first 4 pages)

Mon., Oct. 1:

Lecture: Boosting [Slides] [Annotated]
Readings: (Bishop 14.3) Boosting
Schapire Boosting Tutorial
Optional Reading: Multi-class AdaBoost paper, by Zhu, Rosset, Zou, and Hastie.
Additional resource: Schapire Boosting Tutorial Video.

Wed., Oct. 3:

Homework 1 is due at the beginning of lecture.
Lecture: Cross Validation, Simple Model Selection, Regularization, MDL, Neural Nets [Slides] [Annotated]
Readings: (Bishop 1.3) Model Selection / Cross Validation
(Bishop 3.1.4) Regularized least squares
(Bishop 5.1) Feed-forward Network Functions
Optional Reading: Ron Kohavi's paper, A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection.
Additional Resource: Minimum Description Length website

Mon., Oct. 8:

Lecture: Neural Nets [Slides]
Readings: (Bishop 5.1) Feed-forward Network Functions
(Bishop 5.2) Network Training
(Bishop 5.3) Error Backpropagation

Margin-based approaches (3 Lectures)

SVMs [Applets] from www.site.uottawa.ca/~gcaron/applets.htm
Kernel trick

Wed., Oct. 10:

Lecture: Neural Nets (cont), Instance-based Learning [Slides] [Annotated]
Readings: (Bishop 2.5) Nonparametric Methods

Mon., Oct. 15:

Lecture: SVMs [Slides] [Annotated]
Readings: (Bishop 6.1,6.2) Kernels
(Bishop 7.1) Maximum Margin Classifiers
Hearst 1998: High Level Presentation
Burges 1998: Detailed Tutorial
(Optional) Platt 1998: Training SVMs with Sequential Minimal Optimization
Additional Resource: Smola video tutorial on SVM (see Part 3)
Additional Resource: Scholkopf video tutorial on kernels
Additional Resource: http://www.svms.org

Wed., Oct. 17:

Lecture: SVMs - The Kernel Trick [Slides] [Annotated]
Additional Resource: http://www.kernel-machines.org

Learning Theory (2 Lectures)

Sample complexity
PAC learning [Applets]
www.site.uottawa.ca/~gcaron/applets.htm
Error bounds
VC-dimension
Margin-based bounds
Large-deviation bounds
- Hoeffding's inequality, Chernoff bound
Mistake bounds
No Free Lunch theorem

Mon., Oct. 22:

Lecture: Learning Theory [Slides] [Annotated]
Readings: Goldman's COLT survey, sections 1-3.1
Avrim Blum's course handout on tail inequalities
(Optional) John Langford's tutorial on generalization bounds
(Optional) Littlestone's original (excellent) paper on the Mistake Bound model: Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm
Additional Resource: Langford video tutorial on generalization bounds
Additional Resource: John Shawe-Taylor video tutorial on statistical learning theory
Additional Resource: http://www.learningtheory.org

Wed., Oct. 24:

Lecture: Learning Theory, Midterm review [Slides] [Annotated]

Midterm

Thu., Oct. 25, 5-6:30pm
Location: MM A14

Structured Models (4 Lectures)

HMMs
- Forward-Backward
- Viterbi
- Supervised learning
Graphical Models
- Applet: Java Bayes
- Representation
- Inference
- Learning
- BIC

Mon., Oct. 29:

Lecture: Bayes nets - Representation [Slides] [Annotated]
Readings: (Bishop 8.1,8.2) Bayesian Networks

Wed., Oct. 31:

Lecture: Bayes nets - Representation (cont.), Inference [Slides] [Annotated]
Readings: (Bishop 8.1,8.2) Bayesian Networks

Mon., Nov. 5:

Lecture: BNs inference, HMMs [Slides] [Annotated]
Readings: (Bishop 8.4.1,8.4.2) - Inference in Chain/Tree Structures
Rabiner's Detailed HMMs Tutorial

Wed., Nov. 7:

Lecture: HMMs, Bayes Nets - Structure Learning [Slides] [Annotated]
Additional Reading: Heckerman BN Learning Tutorial
Additional Reading: Tree-Augmented Naive Bayes paper

Unsupervised and semi-supervised learning (4 Lectures)

K-means (Applet: K-means)
Expectation Maximization (EM)
- for Mixture of Gaussians: Applet: Mixture of Gaussians
- for training Bayes nets
- for training HMMs
Combining labeled and unlabeled data
- EM
- reweighting labeled data
- Co-training
- unlabeled data and model selection
Dimensionality reduction (PCA, SVD) Applet: PCA
Feature selection

Mon., Nov. 12:

Lecture: BNs Structure learning, Clustering - K-means [Slides] [Annotated]
Readings: (Bishop 9.1, 9.2) - K-means, Mixtures of Gaussians

Wed., Nov. 14:

Guest Lecture: Online Learning (Avrim Blum) [Slides]

Mon., Nov. 19:

Lecture: EM [Slides] [Annotated]
Readings: (Bishop 9.3, 9.4) - EM
Neal and Hinton EM paper
Ghahramani, "An introduction to HMMs and Bayesian Networks"

Wed., Nov. 21:

NO CLASS: Thanksgiving

Mon., Nov. 26:

Lecture: EM (cont.) and Principal Component Analysis (PCA) [Slides] [Annotated]
Readings: Shlens' PCA tutorial
Optional reading: Wall et al. 2003 - PCA for gene expression data

Learning to make decisions (2 Lectures)

Markov decision processes
Reinforcement learning

Wed., Nov. 28:

Lecture: Markov Decision Processes (MDPs) [Slides] [Annotated]
Readings: Kaelbling et al. Reinforcement Learning tutorial

Special date/time: Thursday, Nov. 29th, 5-6:20pm in Wean 7500:

Lecture: Reinforcement Learning [Slides] [Annotated]
Readings: Brafman and Tennenholtz: Rmax paper

Fri., Nov. 30:

Project Poster Session

2-5pm, Newell-Simon Hall Atrium

Final Exam

Tuesday, Dec. 11, 5:30-8:30pm

Location TBA

Project Paper Due

2pm, Friday, Dec. 14

Machine Learning Fall 2007

10-701 and 15-781

Carlos Guestrin

School of Computer Science, Carnegie Mellon University
