Machine Learning

10-601, Fall 2011

Carnegie Mellon University

Tom Mitchell, Aarti Singh

Previous material

Date	Lecture	Topics	Readings and useful links
Sept 13 slides	Intro to ML; Decision Trees	Machine learning examples; Well-defined machine learning problem; Decision tree learning	Required: Mitchell: The Discipline of Machine Learning; Bishop Ch. 14.4
Sept 15 slides	Decision Tree Learning; Review of Probability	The big picture; Overfitting; Random variables, probabilities	Required: Bishop Ch. 1 through 1.2.3; Bishop Ch. 2 through 2.2. Optional: Moore: Basic Probability Tutorial; Mitchell VIDEO: Probability, MLE, MAP
Sept 20 slides	Probability and Estimation	Probability review; Bayes rule; MLE	Required: Bishop Ch. 1 through 1.2.3; Bishop Ch. 2 through 2.2. Optional: Moore: Basic Probability Tutorial; Mitchell VIDEO: Probability, MLE, MAP
Sept 22 slides	Naive Bayes; MAP estimates	Conditional independence; Naive Bayes	Required: Mitchell: Naive Bayes, Logistic Regression. Optional: Mitchell VIDEO: Naive Bayes
Sept 27 slides	Naive Bayes; MAP estimates	MAP estimates, Conjugate priors; Document classification	Required: Mitchell: Naive Bayes, Logistic Regression. Optional: Bishop: chapters 1.2.4, 4.2; Mitchell VIDEO: Gaussian Naive Bayes
Sept 29 slides	Gaussian Naive Bayes; Logistic Regression	Gaussian Naive Bayes; Brain image classification; Logistic Regression; Gradient ascent	Required: Mitchell: Naive Bayes, Logistic Regression; Bishop: Chapter 1.2.4. Optional: Ng & Jordan: On Discriminative and Generative Classifiers, NIPS, 2001; Mitchell VIDEO: Logistic regression
Oct 4 slides	Logistic Regression; Generative/Discriminative	Logistic regression regularization and MAP estimation	Required: Bishop: Chapter 1.2.5; Bishop: Chapter 3 through 3.2
Oct 6 slides	Linear regression	Linear regression; Polynomial regression; Bias-variance decomposition	Optional: Mitchell VIDEO: Linear regression (min. 47:59)
Oct 11 slides	Graphical Models 1	Bayes nets; Representing joint distributions with conditional independence assumptions; D-separation and conditional independence	Required: Bishop: Ch 8, through 8.2. Optional: Mitchell VIDEO: Bayes nets I; Murphy Intro. to Graphical Models; Jordan Graphical Models tutorial
Oct 13 slides	Graphical Models 2	D-separation; Inference	Required: Bishop: Ch 8, through 8.2. Optional: Mitchell VIDEO: Bayes nets II
Oct 18 slides	Graphical Models 3	EM; Mixture of Gaussians clustering; Learning Bayes Net structure - Chow-Liu	Required: Bishop: 9.2; Bilmes: Sections 1 through 3: EM and HMM tutorial. Optional: Bishop: 9.3.3, 9.4; Mitchell VIDEO: Bayes nets III; Mitchell VIDEO: Bayes nets IV
Oct 20 slides	Computational Learning Theory 1	PAC Learning	Optional: Mitchell: Chapter 7; Mitchell VIDEO: PAC learning
Oct 25 PAC learning slides; Midterm Review slides	Computational Learning Theory 2	PAC Learning; VC Dimension; Midterm review	Optional: Mitchell: Chapter 7; Mitchell VIDEO: PAC learning
Oct 27	Midterm	Open book, Open notes, No computers	Midterm Exam with Solutions; Mid-Semester Grades Histogram
Nov 1 slides	Hidden Markov Models	Markov models; HMMs and Bayes Nets; Other probabilistic time series models	Required: Bishop: 13.1, 13.2; HMM and EM Tutorial
Nov 3 slides	Neural Networks	Non-linear regression; Backpropagation and gradient descent; Learning hidden layer representations	Derivation of Backpropagation; Mitchell Ch. 4; Bishop Ch. 5
Nov 8 slides	Learning Representations 1	Feature Selection; Principal Component Analysis (PCA)	Bishop Ch. 12 through 12.1; A Tutorial on PCA, J. Shlens
Nov 10 slides	Learning Representations 2	SVD; ICA; Laplacian Eigenmaps; k-means and spectral clustering	SVD and PCA, Wall et al.; Spectral Clustering tutorial; Spectral Clustering demo
Nov 15 slides	Nonparametric methods	Histogram and Kernel density estimation; k-NN Classifier; Kernel Regression	Bishop: Sec 2.5, 6.3; Mitchell: Ch 8; Tutorial on Instance-based Learning by Andrew Moore; Matlab demo files
Nov 17 slides	Support Vector Machines 1	Maximizing margin; SVM formulation; Slack variables, hinge loss; Multi-class SVM	Bishop: Sec 7.1, Sec 4.1.1, 4.1.2, Appendix E
Nov 22 slides	Support Vector Machines 2	Constrained optimization; Dual SVM; Kernel Trick; Comparison with Kernel regression and logistic regression	Bishop: Sec 6.1, 6.2; Tutorials on SVMs and Kernels
Nov 29 slides	Boosting	Combining weak classifiers; AdaBoost algorithm; Comparison with logistic regression and bagging	Bishop: Sec 14.3; Boosting homepage; Schapire: Boosting Tutorial, Video; AdaBoost Applet
Dec 1 slides	Semi-supervised Learning	Generative Methods; Graph-based Methods; Multi-view Methods	SSL survey
Dec 6 slides	Active Learning	Binary Bisection; Uncertainty sampling; Query-by-Committee	Active Learning Survey
Dec 8 slides	Review
Dec 16, 5:30 - 8:30 PM	Final Exam	Open book, Open notes, No computers