CMU 10-806 Foundations of Machine Learning and
Data Science, Fall 2015
Mon/Wed 4:30-5:50, GHC 4303
Course description: This course will cover fundamental
topics in Machine Learning and Data Science, including powerful
algorithms with provable guarantees for making sense of and
generalizing from large amounts of data. The course will start by
providing a basic arsenal of useful statistical and computational
tools, including generalization guarantees, core algorithmic
methods, and fundamental analysis models. We will examine
questions such as: Under what conditions can we hope to
meaningfully generalize from limited data? How can we best combine
different kinds of information such as labeled and unlabeled data,
leverage multiple related learning tasks, or leverage multiple
types of features? What can we prove about methods for summarizing
and making sense of massive datasets, especially under limited
memory? We will also examine other important constraints and
resources in data science including privacy, communication, and
taking advantage of limited interaction. In addressing these and
related questions we will make connections to statistics,
algorithms, linear algebra, complexity theory, information theory,
optimization, game theory, and empirical machine learning
research. [More info] [People and office hours]
Homeworks
Take-home final
You can take the test during any 24-hour period you choose, up
until Fri Dec 18 (i.e., midnight Dec 18 is the latest hand-in date).
Here is the take-home final.
Project poster presentations will be Thursday December 17, 4-6pm in the GHC
7th floor Atrium. If you cannot make that time, please come talk
with us.
Writeups due by Sunday December 20.
Lecture Notes & Handouts
- 09/09: Introduction. The consistency model.
See also Chapter 1 in the Kearns-Vazirani book.
- 09/14: The PAC model for passive
supervised learning.
See also Chapter 1 in the Kearns-Vazirani book.
- 09/16: Effective number of
hypotheses, VC-dimension, and Sauer's lemma.
See also Chapter 3 in the Kearns-Vazirani book.
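For quick reference, Sauer's lemma as covered in this lecture can be stated as follows (a standard formulation, not taken from the course notes themselves): if H has VC-dimension d, then its growth function satisfies

```latex
\Pi_H(m) \;\le\; \sum_{i=0}^{d} \binom{m}{i} \;\le\; \left(\frac{em}{d}\right)^{d}
\quad \text{for } m \ge d,
```

so the effective number of hypotheses grows only polynomially in m rather than as 2^m.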
- 09/21: Sample complexity results
for infinite hypothesis spaces.
See also Chapter 3 in the Kearns-Vazirani book.
- 09/23: Sample complexity results
for infinite hypothesis spaces (cont'd).
See also Chapter 3 in the Kearns-Vazirani book.
- 09/28: Sample complexity lower
bounds for passive supervised learning.
See also Chapter 3.6 in the Kearns-Vazirani book.
- 09/30: Sample complexity results for
the agnostic case.
- 10/05: Generalization bounds based
on Rademacher complexity.
See also Chapter 3 in the Mohri, Rostamizadeh, and Talwalkar
book.
See also the survey Introduction
to Statistical Learning Theory by O. Bousquet, S.
Boucheron, and G. Lugosi.
See also the survey Theory
of Classification: A Survey of Some Recent Advances by O.
Bousquet, S. Boucheron, and G. Lugosi.
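As a pointer to what these references cover: the empirical Rademacher complexity of a function class F on a sample S = (x_1, ..., x_m), and a standard generalization bound built from it (this is one common form, e.g. as in the Mohri-Rostamizadeh-Talwalkar book, for f taking values in [0,1]):

```latex
\hat{\mathfrak{R}}_S(F) \;=\; \mathbb{E}_{\sigma}\!\left[\sup_{f \in F} \frac{1}{m}\sum_{i=1}^{m} \sigma_i f(x_i)\right],
\qquad \sigma_i \ \text{i.i.d. uniform on } \{\pm 1\};
```

and with probability at least 1 - delta, every f in F satisfies

```latex
\mathbb{E}[f] \;\le\; \frac{1}{m}\sum_{i=1}^{m} f(x_i) \;+\; 2\,\hat{\mathfrak{R}}_S(F) \;+\; 3\sqrt{\frac{\ln(2/\delta)}{2m}}.
```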
- 10/07: Computational hardness results.
See also Ch. 1.4, 1.5, and 6.1 in the Kearns-Vazirani book. More
resources on NP-hardness: 1,
2.
- 10/12: Online learning and
optimization I: mistake-bounds and combining expert advice.
Further readings:
book
chapter
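The deterministic weighted-majority algorithm from this lecture can be sketched in a few lines. This is an illustrative implementation, not the course's code; the halving penalty beta = 1/2 and the 0/1 prediction setup are assumptions made for the sketch.

```python
def weighted_majority(expert_preds, truths, beta=0.5):
    """Deterministic Weighted Majority: predict with the weighted
    majority vote, then multiply the weight of every expert that
    erred by beta.
    expert_preds: list of rounds, each a list of 0/1 predictions.
    truths: list of 0/1 true outcomes.
    Returns (number of mistakes, final weights)."""
    n = len(expert_preds[0])
    w = [1.0] * n
    mistakes = 0
    for preds, y in zip(expert_preds, truths):
        vote1 = sum(wi for wi, p in zip(w, preds) if p == 1)
        vote0 = sum(wi for wi, p in zip(w, preds) if p == 0)
        guess = 1 if vote1 >= vote0 else 0
        if guess != y:
            mistakes += 1
        # multiplicatively penalize the experts that were wrong this round
        w = [wi * beta if p != y else wi for wi, p in zip(w, preds)]
    return mistakes, w
```

With beta = 1/2, the standard mistake bound is M <= 2.41 (m + log2 n), where m is the best expert's mistake count; so with a perfect expert among n experts, the algorithm makes O(log n) mistakes.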
- 10/14: Online learning and
optimization II: ERM and Follow the Regularized Leader.
See also the Shalev-Shwartz
monograph
- 10/19: Online learning and
optimization III: FTRL contd, and Follow the Perturbed Leader.
- 10/21: Online learning and optimization IV: FPL contd, and the multi-armed bandit setting.
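One classic algorithm for the adversarial multi-armed bandit setting is EXP3. The sketch below is illustrative (the fixed exploration rate gamma and the reward-oracle interface are choices made here, not taken from the lecture); the key idea is the importance-weighted reward estimate, which keeps the update unbiased even though only the pulled arm's reward is observed.

```python
import math
import random

def exp3(n_arms, rewards, T, gamma=0.1, seed=0):
    """EXP3 for the adversarial multi-armed bandit.
    rewards(t, i) -> reward in [0, 1] of arm i at round t
    (only the pulled arm's reward is observed).
    Returns (final weights, pull counts)."""
    rng = random.Random(seed)
    w = [1.0] * n_arms
    pulls = [0] * n_arms
    for t in range(T):
        total = sum(w)
        # mix the weight distribution with uniform exploration
        p = [(1 - gamma) * wi / total + gamma / n_arms for wi in w]
        i = rng.choices(range(n_arms), weights=p)[0]
        pulls[i] += 1
        x = rewards(t, i)
        # importance-weighted estimate x / p[i] keeps the update unbiased
        w[i] *= math.exp(gamma * (x / p[i]) / n_arms)
    return w, pulls
```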
- 10/26: Boosting:
weak learning, strong learning, and AdaBoost.
See also Chapter 4 in the Kearns-Vazirani book and Chapter 6 in
the Mohri-Rostamizadeh-Talwalkar book.
See also
Rob Schapire's notes.
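AdaBoost, as covered in this lecture, can be sketched as follows. This is a minimal illustration using 1-D decision stumps as the weak learner (the stump weak learner and variable names are assumptions of the sketch, not the course's code).

```python
import math

def stump_predict(x, thresh, sign):
    # a decision stump on a single real feature
    return sign if x > thresh else -sign

def adaboost(xs, ys, rounds=10):
    """AdaBoost with 1-D decision stumps as the weak learner.
    xs: list of floats; ys: list of +/-1 labels.
    Returns a list of (alpha, thresh, sign) weighted stumps."""
    m = len(xs)
    D = [1.0 / m] * m                         # uniform initial distribution
    sorted_xs = sorted(set(xs))
    candidates = [min(xs) - 1] + [(a + b) / 2
                                  for a, b in zip(sorted_xs, sorted_xs[1:])]
    ensemble = []
    for _ in range(rounds):
        def werr(t, s):
            return sum(d for d, x, y in zip(D, xs, ys)
                       if stump_predict(x, t, s) != y)
        # weak learner: the stump with minimum weighted training error
        t, s = min(((t, s) for t in candidates for s in (+1, -1)),
                   key=lambda ts: werr(*ts))
        err = min(max(werr(t, s), 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, s))
        # reweight: up-weight the examples this stump got wrong
        D = [d * math.exp(-alpha * y * stump_predict(x, t, s))
             for d, x, y in zip(D, xs, ys)]
        Z = sum(D)
        D = [d / Z for d in D]
    return ensemble

def predict(ensemble, x):
    score = sum(a * stump_predict(x, t, s) for a, t, s in ensemble)
    return 1 if score >= 0 else -1
```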
- 10/28: Learning and game theory.
- 11/02: Learning and game theory.
See also this book
chapter
- 11/04: Streaming Algorithms I:
estimating frequency counts, the count-min sketch, and the
beginning of distinctness counting.
See also the Muthukrishnan
lecture notes and the Chakrabarti
lecture notes
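The count-min sketch from this lecture fits in a few lines. The sketch below is illustrative (the salted-hash rows stand in for the pairwise-independent hash families used in the analysis); its defining property, that every estimate is an over-count of the true frequency, holds regardless.

```python
import random

class CountMin:
    """Count-min sketch: depth x width array of counters, one hash
    row per depth; estimates never undercount the true frequency."""

    def __init__(self, width=200, depth=4, seed=0):
        rng = random.Random(seed)
        self.width = width
        # one random salt per row stands in for an independent hash function
        self.salts = [rng.getrandbits(32) for _ in range(depth)]
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        for row, salt in enumerate(self.salts):
            yield row, hash((salt, item)) % self.width

    def add(self, item, count=1):
        for row, col in self._cells(item):
            self.table[row][col] += count

    def estimate(self, item):
        # the minimum over rows is the least-contaminated counter
        return min(self.table[row][col] for row, col in self._cells(item))
```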
- 11/09: Streaming Algorithms II:
distinctness counting, frequency moments.
- 11/11: The Johnson-Lindenstrauss
Lemma and tensor methods.
See also the Moitra
notes and the Dasgupta
notes
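The Johnson-Lindenstrauss lemma can be illustrated with a random Gaussian projection; the code below is a minimal sketch (pure-Python, unoptimized) of the construction typically used in the proof: with target dimension k = O(log n / eps^2), all pairwise distances are preserved to within a 1 +/- eps factor with high probability.

```python
import math
import random

def jl_project(points, k, seed=0):
    """Project d-dimensional points to k dimensions with a random
    Gaussian matrix scaled by 1/sqrt(k), so squared lengths are
    preserved in expectation (the Johnson-Lindenstrauss construction)."""
    rng = random.Random(seed)
    d = len(points[0])
    R = [[rng.gauss(0, 1) / math.sqrt(k) for _ in range(d)]
         for _ in range(k)]
    return [[sum(r[j] * p[j] for j in range(d)) for r in R]
            for p in points]

def dist(u, v):
    # Euclidean distance
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```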
- 11/16: Foundations of Active Learning Intro slides and Lecture Notes.
See also the survey Two
Faces of Active Learning by S. Dasgupta.
See also the survey Active
Learning Survey by Balcan and Urner.
- 11/18: Disagreement Based Active
learning.
See also the survey Theory
of Disagreement-Based Active Learning by S. Hanneke.
- 11/23: Active Learning of Linear
Separators and Slides.
- 11/30: Semi-Supervised
Learning.
See also this survey article
- 12/02: Semi-Supervised
Learning and brief discussion
of multilayer networks.
- 12/07: Differential privacy and
Statistical Query learning and Slides.
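The canonical example of a differentially private mechanism is the Laplace mechanism for counting queries; the sketch below illustrates the definition (the function name and query interface are choices made for this illustration, not the lecture's). A count has sensitivity 1, so adding Laplace(1/epsilon) noise gives epsilon-differential privacy.

```python
import math
import random

def private_count(records, predicate, epsilon, seed=0):
    """Release a counting query under epsilon-differential privacy
    via the Laplace mechanism: a count changes by at most 1 when one
    record changes (sensitivity 1), so Laplace(1/epsilon) noise suffices."""
    rng = random.Random(seed)
    true_count = sum(1 for r in records if predicate(r))
    # sample Laplace(0, 1/epsilon) by inverse-CDF from u ~ Uniform(-1/2, 1/2)
    u = rng.random() - 0.5
    noise = -(1.0 / epsilon) * math.copysign(1, u) * math.log(1 - 2 * abs(u))
    return true_count + noise
```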
- 12/09: Distributed learning.
See also: Jordan and Mitchell, Machine learning: Trends, perspectives, and prospects.