Machine Learning, 15-681 and 15-781, Fall 1998
Professor Tom M. Mitchell
School of Computer Science, Carnegie Mellon University
Machine Learning is concerned with computer programs that
automatically improve their performance through experience. This
course covers the theory and practice of machine learning from a
variety of perspectives. We cover topics such as learning decision
trees, neural network learning, statistical learning methods, genetic
algorithms, Bayesian learning methods, explanation-based learning, and
reinforcement learning. The course covers theoretical concepts such as
inductive bias, the PAC and Mistake-bound learning frameworks, minimum
description length principle, and Occam's Razor. Programming
assignments include hands-on experiments with various learning
algorithms. Typical assignments include neural network learning for
face recognition and decision tree learning from databases of credit
records.
Class lectures: Tues & Thurs 12:00-1:20, Wean Hall 5403
Optional recitation section:
Mon 12:30-1:30 beginning Sept 21, Wean Hall 5409
Instructor:
Tom Mitchell, Wean Hall 5309, x8-2611. Office hours: Wed 3:00-4:00
Teaching Assistants:
Leemon Baird, Wean Hall 4207, x8-3728. Office hours: Thursdays 3:00-4:00
Dimitris Margaritis, Wean Hall 8122, x8-3070. Office hours: Tuesdays 3:00-4:00
Course Secretary:
Jean Harpley, Wean Hall 5313, x8-3802
Textbook:
Machine Learning, Tom Mitchell, McGraw Hill, 1997.
Course Website:
www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-3/www/ml.html
Course Newsgroup:
cmu.cs.class.cs681
Course Projects (15-781 only):
This course is offered as both an upper-level undergraduate course
(15-681) and a graduate-level course (15-781). Ph.D. students
registering for 15-781 will be expected to complete an extra course
project, due at 5pm on December 7, 1998.
Grading:
- 15-681: Will be based on homeworks (35%), midterm (30%), and final (35%).
- 15-781: Will be based on homeworks (17.5%), course project (17.5%), midterm (30%), and final (35%) (see the worked example below).
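To make the 15-781 weighting concrete, here is a minimal sketch in Python; the component scores are hypothetical, chosen purely for illustration:

    # Weights from the 15-781 breakdown above; scores are hypothetical examples.
    weights = {"homeworks": 0.175, "project": 0.175, "midterm": 0.30, "final": 0.35}
    scores = {"homeworks": 88.0, "project": 92.0, "midterm": 81.0, "final": 85.0}

    # The course grade is the weighted sum of the four components.
    grade = sum(weights[k] * scores[k] for k in weights)
    print(f"course grade: {grade:.2f}")  # 85.55 for these example scores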
Policy on late homework:
- Homework is worth full credit at the beginning of class on the due date.
- It is worth half credit for the next 48 hours.
- It is worth zero credit after that (see the sketch below).
- You must turn in all but two assignments, even if for zero credit.
- Free exemption: We will ignore your lowest homework grade for the semester.
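As a concrete reading of the first three rules, here is a minimal sketch in Python; the function name and the convention of measuring lateness in hours from the beginning of class on the due date are assumptions for illustration:

    def late_credit(hours_late: float) -> float:
        """Credit multiplier implied by the late-homework rules above."""
        if hours_late <= 0:
            return 1.0  # turned in by the start of class: full credit
        elif hours_late <= 48:
            return 0.5  # within the next 48 hours: half credit
        else:
            return 0.0  # anything later: zero credit

    # Example: a homework turned in one day late earns half credit.
    print(late_credit(24.0))  # 0.5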
Homework assignments (PostScript as available)
- Assignment 1. Version spaces, PAC learning. Handed out Sept 15, Due Sept 22. (LaTeX source)
- Assignment 2. Decision tree learning. Handed out Sept 24, Due Oct 1. (LaTeX source)
- Assignment 3. Neural network learning for face recognition. Handed out Oct 6, Due Oct 15. (LaTeX source)
- Assignment 4. Statistical estimation, Bayesian methods. Handed out Nov 10, Due Nov 17. (LaTeX source)
- Assignment 5. Genetic algorithms, lazy learning, RBFs, reinforcement learning. Handed out Nov 24, Due Friday, Dec 4. (LaTeX source)
Lecture plan (and PostScript slides when available).
- Aug 25, 1998. Overview of learning (optional lecture). (Read Chapter 1 (not optional :-))
- Sept 15. Concept learning, version spaces (ch. 2)
- Sept 17. Inductive bias, PAC learning (ch. 2; ch. 7 up through 7.3)
- Sept 22. PAC learning, VC dimension, mistake bounds (ch. 7.4 through 7.4.3, 7.5 through 7.5.3) (lecture slides same as Sept 17 lecture)
- Sept 24. Decision trees (ch. 3)
- Sept 29. Decision trees, overfitting, Occam's razor (ch. 3)
- Oct 1. Neural networks (ch. 4)
- Oct 6. Neural networks (ch. 4)
- Oct 8. Estimation and confidence intervals (ch. 5). Guest lecture: Prof. Larry Wasserman, Professor of Statistics, CMU
- Oct 13. Bayesian learning: MAP and ML learners (ch. 6)
- Oct 15. Bayesian learning: MDL, Bayes optimal classifier, Gibbs sampling (ch. 6)
- Oct 20. Naive Bayes and learning over text (ch. 6)
- Oct 22. Bayes nets (ch. 6)
- Oct 27. Midterm exam. Open notes, open book. Results: midterm histograms for 15-681 and 15-781.
- Oct 29. EM and combining labeled with unlabeled data (ch. 6)
- Nov 3. Combining learned classifiers, weighted majority, bagging (ch. 7: weighted majority)
- Nov 5. Biological learning. Guest lecture: Prof. Jay McClelland, Director of the Center for the Neural Basis of Cognition, CMU (see papers handed out)
- Nov 10. Boosting, genetic algorithms, genetic programming (ch. 9)
- Nov 12. More on genetic programming (ch. 9)
- Nov 17. Instance-based learning, k-nearest neighbor, locally weighted regression, radial basis functions (ch. 8, through 8.4)
- Nov 19. Support vector machines (see also Burges' SVM tutorial)
- Nov 24. Learning rules, inductive logic programming (ch. 10)
- Dec 1. Reinforcement learning I (ch. 13)
- Dec 3. Reinforcement learning II (ch. 13)
- Dec 14. FINAL EXAM
Note to people outside CMU
Feel free to use the slides and materials available online here. Please
email Tom.Mitchell@cmu.edu with any corrections or improvements.
Additional slides are available at the Machine Learning textbook homepage.
See also the Fall 2002 version of this course and the Fall 1997 version
of this course.