Machine Learning

10-701/15-781, Fall 2010

Aarti Singh

Date	Lecture	Topics	Readings and useful links	Handouts
Sept 8	Intro to ML Slides	ML applications What constitutes an ML algorithm? Learning paradigms, Loss functions Supervised learning (classification, regression) Unsupervised learning (density estimation, clustering, dimensionality reduction) Bayes Optimal Learning Rule	Bishop: Sec 2.1, Appendix B Mitchell: Ch 1
Sept 13	Learning distributions Slides	Learning parametric distributions Maximum Likelihood Estimation (MLE) Maximum A Posteriori (MAP) Estimation	Andrew Moore's Basic Probability Tutorial Bishop: Sec 2.2, 2.3 (up to 2.3.6)	HW1 is out
Sept 15	Optimal Classifier Slides	MLE vs. MAP Bayes Optimal Classifier	Bishop: Sec 1.5
Sept 20	Naive Bayes Slides	Conditional Independence Naive Bayes Classifier Discrete Features Continuous Features	Mitchell's Chapter Draft
Sept 22	Logistic regression Slides	Generative vs. Discriminative Classifiers Logistic regression	Mitchell's Chapter Draft Bishop: Sec 4.1-4.3 On Discriminative and Generative Classifiers, Ng and Jordan, NIPS, 2001 (pdf) On gradient descent and Newton's method: Boyd's slides and Chapter 9 of Convex Optimization.
Sept 27	Regression Slides	Linear Regression Polynomial Regression	Least Squares Applet Tutorial on regression by Andrew Moore Bishop: Sec 3.1	HW1 due
Sept 29	Nonparametric methods Slides	Histogram, Kernel Density Estimation K-NN Classifier Kernel Regression	Bishop: Sec 2.5, 6.3 Mitchell: Ch 8 Tutorial on Instance-based Learning by Andrew Moore	HW2 is out
Oct 4	Model Selection Slides	Overfitting Bias-Variance Tradeoff Model Selection Cross-validation Structural Risk Minimization Complexity Regularization Information Criteria (AIC, BIC, MDL)	Bishop: Sec 1.3, 3.1.4 Hastie: Ch 7 (recommended) A study of CV and Bootstrap (optional) MDL website (optional) Model Selection and MDL principle paper by M. Hansen and B. Yu (optional)
Oct 6	Decision Trees Slides	Decision Tree Representation Entropy, Information gain Overfitting, Pre- and Post-pruning, MDL	Mitchell: Ch 3 Decision Tree Applet
Oct 11	Boosting Slides	Combining weak classifiers Adaboost algorithm Comparison with logistic regression and bagging	Bishop: Sec 14.3 Boosting homepage Schapire: Boosting Tutorial, Video Adaboost Applet	Project Proposal due
Oct 13	Support Vector Machines Slides	Maximizing margin SVM formulation Slack variables, Hinge loss Multi-class SVM	Bishop: Sec 7.1, Sec 4.1.1, 4.1.2, Appendix E Stephen Boyd's book: Ch 5 (optional)	HW2 due HW3 is out
Oct 18	Support Vector Machines Slides	Constrained Optimization Dual SVM Kernel Trick Comparison with Kernel regression and Logistic Regression	Bishop: Sec 6.1, 6.2 Tutorials on SVMs and Kernels Additional resource: SVM website
Oct 20		Midterm Exam	Score distribution	Exam Solution
Oct 25	Clustering Slides	What is clustering? Hierarchical Clustering Single linkage Complete linkage Average linkage Partition-based Clustering K-means algorithm	Bishop: Sec 9.1
Oct 27	EM Algorithm Slides	Gaussian Mixture Model Expectation Maximization Algorithm	Bishop: Ch 9
Nov 1	Learning Theory I Slides Annotated Slides	Sample complexity Haussler bound PAC Learning Hoeffding's bound	Mitchell: Ch 7	HW3 due HW4 is out
Nov 3	Learning Theory II Slides	VC dimension Mistake Bounds	Mitchell: Ch 7
Nov 8	HMM Slides	HMM Representation Forward Algorithm Forward-Backward Algorithm Viterbi Algorithm Baum-Welch Algorithm	Bishop: Ch 13 HMM and EM Tutorial	Midterm project report due
Nov 10	Graphical Models I Slides	Representation - Directed models Factorization of joint distribution Local Markov Assumption D-separation Representation Theorem	Bishop: Ch 8 Graphical Models tutorial by M. Jordan Intro to Graphical Models by K. Murphy
Nov 15	Graphical Models II Slides	Representation - Undirected models Factorization of joint distribution Graph separation Hammersley-Clifford Theorem Inference Variable Elimination	Bishop: Ch 8 Graphical Models tutorial by M. Jordan Intro to Graphical Models by K. Murphy	HW4 due
Nov 17	Graphical Models III Dimensionality Reduction Slides	Learning - Graphical Models Learning CPTs Learning structure - Chow-Liu Algorithm Dimensionality Reduction Feature Selection PCA (Principal Components Analysis)		HW5 is out
Nov 22	Nonlinear Dimensionality Reduction Slides Spectral Clustering Slides	Laplacian Eigenmaps Spectral Clustering	Belkin-Niyogi Paper on Laplacian Eigenmaps Spectral Clustering tutorial by Ulrike von Luxburg Spectral Clustering demo
Nov 29	Neural Networks Slides	Neural Networks Prediction - Forward Propagation Training - Backpropagation	Derivation of Backpropagation (pdf)
Dec 1	Semi-Supervised Learning Slides
Dec 2		Project Poster Presentation (3-6 pm NSH Atrium)
Dec 7		Final Project report due (by 10:30 am)	Both project report and HW5 are due by 10:30 am in Michelle's office (GHC 8001)	HW5 due (by 10:30 am)
Dec 14		Final Exam (1-4 pm), DH 2210