Introduction to Machine Learning (SCS majors)
Machine Learning is concerned with computer programs that automatically improve their performance through experience (e.g., programs that learn to recognize human faces, recommend music and movies, and drive autonomous robots). This course covers the core concepts, theory, algorithms and applications of machine learning.
After completing the course, students should be able to:
This course is designed for SCS undergraduate majors. It covers many of the same topics as other introductory machine learning courses, such as 10-301/10-601 and 10-701. This course (10-315) and 15-281 AI: Representation and Problem Solving are designed to complement each other, providing both breadth and depth across AI and ML topics. Contact the instructor if you are concerned about which machine learning course is appropriate for you.
The prerequisites for this course are:
While Python experience is not explicitly a prerequisite, we will be programming exclusively in Python. Please see the instructor if you are unsure whether your background is suitable for the course.
Pat would very much like to help you all as much as possible. In addition to standing office hours, he often has "OH" (or "Open") appointment slots on his office hours appointment calendar. If there are no available OH or appointment slots that meet your needs, please contact Pat via a private post on Piazza with a list of times that work for you to meet.
Subject to change
Bishop, Christopher. Pattern Recognition and Machine Learning, available online, (optional)
Daumé III, Hal. A Course in Machine Learning, available online
(DL) Goodfellow, Ian, Yoshua Bengio, Aaron Courville. Deep Learning, available online, (optional)
(MML) Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong. Mathematics for Machine Learning, available online
Mitchell, Tom. Machine Learning, available online
Murphy, Kevin P. Machine Learning: A Probabilistic Perspective, available online, (optional)
(KMPA) Shawe-Taylor, John, Nello Cristianini. Kernel Methods for Pattern Analysis, available online, (optional)
Dates | Topic | Lecture Materials | Pre-Reading | Reading (optional) |
---|---|---|---|---|
1/16 Mon | No class: MLK Day | | MML 2.1-3, 2.5, 2.6 and 3.1, 3.2.1, 3.3 | |
1/18 Wed | Introduction | 10315 Notation Guide.pdf; pptx (inked) pdf (inked) | Mitchell 1.1-1.2 | |
1/23 Mon | Decision Trees | pptx (inked) pdf (inked) | Daumé 1 | |
1/25 Wed | Decision Trees; K-Nearest Neighbor and Model Selection | pptx (inked) pdf (inked); pptx (inked) pdf (inked) | Daumé 2, Daumé 3; Entropy, Cross-Entropy video; A. Géron | |
1/30 Mon | K-NN and Model Selection (cont.) | See previous lecture slides | Pre-reading: ipynb or pdf; Checkpoint 1 due 1/30 Mon morning, 10 am | MML 8.2.4, 8.3.3 |
2/1 Wed | Optimization and Linear Regression | pptx (inked) pdf (inked); regression interactive.ipynb; regression blind interactive.ipynb | MML 8.2-8.2.2; MML 5.2-5.5 | |
2/6 Mon | Optimization and Linear Regression (cont.); Feature Engineering | See previous lecture slides; pptx (inked) pdf (inked) | | |
2/8 Wed | Logistic Regression | pptx (inked) pdf (inked); linear logistic.ipynb; quadratic logistic.ipynb; Multiclass logistic Desmos | Bishop 4.1.3, 4.3.2, 4.3.4 | |
2/13 Mon | Neural Networks | pptx (inked) pdf (inked); Convex functions Desmos | MML 5.6; DL 6 | |
2/15 Wed | Neural Networks | See previous lecture slides; Universal network Desmos | | |
2/20 Mon | Regularization | pptx (inked) pdf (inked) | DL 7.1, 7.8; Bishop 3.1.4 | |
2/22 Wed | MLE and Probabilistic Modeling | pptx (inked) pdf (inked); MLE notes (draft; last section added soon!): pdf | MML 9; Bishop 1.2.4-5, 3.1.1-2 | |
2/27 Mon | MLE and Probabilistic Modeling | See previous lecture slides | ||
3/1 Wed | EXAM 1 In-class | Learning objectives: pdf; Practice problems: pdf (sol) | | |
3/6 Mon | No class: Spring Break | |||
3/8 Wed | No class: Spring Break | |||
3/13 Mon | Neural Net Applications | In-progress: pptx pdf | DL 9 | |
3/15 Wed | Neural Net Applications (cont.) | See previous lecture slides | ||
3/20 Mon | MAP | pptx (inked) pdf (inked) | Pre-reading: MLE notes (draft; last section added soon!) pdf | Mitchell MLE and MAP |
3/22 Wed | Probabilistic Generative Models and Naive Bayes | pptx (inked) pdf (inked); discriminant analysis.ipynb | Mitchell Generative and Discriminative Classifiers; Murphy 3.5, 4.2, 8.6 | |
3/27 Mon | Probabilistic Generative Models and Naive Bayes (cont.) | See previous lecture slides | ||
3/29 Wed | Dimensionality Reduction: PCA, Autoencoders, Feature Learning | pptx (inked) pdf (inked) | Bishop 12.1, Murphy 12.2 | |
4/3 Mon | Recommender Systems | pptx (inked) pdf (inked) | Matrix Factorization Techniques for Recommender Systems. Koren, Bell, and Volinsky (2009) | |
4/5 Wed | Clustering, K-means | pptx (inked) pdf (inked) | Bishop 9.1, Murphy 25.5 | |
4/10 Mon | Gaussian Mixture Models, EM Algorithm | pptx (inked) pdf (inked) | Bishop 9.2 | |
4/12 Wed | Nonparametric Regression, Kernels | pptx (inked) pdf (inked) | KMPA 7.3, 7.3.2 | |
4/17 Mon | SVMs, Duality | |||
4/19 Wed | Learning Theory, PAC | pdf (inked) | ||
4/24 Mon | Learning Theory, VC Dimensions | |||
4/26 Wed | EXAM 2 In-class | Learning objectives: pdf; Practice problems: pdf (sol) | | |
Recitation starts the first week of class, Friday, Jan. 20. Recitation attendance is recommended to help solidify weekly course topics. That being said, the recitation materials published below are required content and are in scope for midterms 1 and 2. Students frequently say that the recitations are one of the most important aspects of the course.
Recitation section assignments will be locked down after the third week. Until then, you may try attending different recitation sections to find the best fit for you. If any recitation section is overcrowded, priority goes to students who are officially registered for that section in SIO. The process to select your final recitation assignment will be announced on Piazza as we get closer to Recitation 4.
Recitations will be on Fridays in the following individual recitation sections:
Section | Time | Location | TAs | Resources |
---|---|---|---|---|
A+B | Friday 1:00 pm - 1:50 pm | Hall of Arts 160 | Meher, Deep | Drive folder |
C | Friday 11:00 am - 11:50 am | WEH 2302 | Shreeya, Medha | Drive folder |
D | Friday 9:00 am - 9:50 am | GHC 4102 | Saloni, Alex | Drive folder |
E | Friday 10:00 am - 10:50 am | DH 1112 | Devanshi, Arya | Drive folder |
F | Friday 11:00 am - 11:50 am | DH 1112 | Saumya, Ruthie | Drive folder |
Dates | Recitation | Handout/Code |
---|---|---|
1/20 Fri | Recitation 1: NumPy | Reference: NumPy_Tutorial_from_11-785.ipynb; visualizing_data_1.ipynb; visualizing_data_2.ipynb; visualizing_data_3.ipynb; indexing_trick.ipynb; messing_with_mnist.ipynb |
1/27 Fri | Recitation 2: Decision Trees and K-NN | pdf (solution) and kNN.ipynb |
2/3 Fri | Recitation 3: Matrix Calculus and Linear Regression | pdf (solution) |
2/10 Fri | Recitation 4: Logistic Regression | pdf (solution) |
2/17 Fri | Recitation 5: Neural Networks | pdf (solution) |
2/24 Fri | Recitation 6: Regularization, Prob/Stat/MLE | pdf (solution) |
3/3 Fri | No recitation | |
3/10 Fri | No recitation | |
3/17 Fri | Recitation 7: PyTorch, Convnets, MAP | PyTorch Overview Slides; PyTorch Tutorial Notebook.ipynb; Worksheet: pdf (solution) |
3/24 Fri | Recitation 8: Generative Models | pdf (solution); discriminant analysis.ipynb |
3/31 Fri | Recitation 9: Generative+MAP, PCA | pdf (solution) |
4/7 Fri | Recitation 10: Recommender Systems, Clustering | pdf (solution) |
4/14 Fri | Recitation 10.5: Kernel Regression (no in-person recitation) | pdf (solution) |
4/21 Fri | Recitation 11 | pdf (solution) |
4/28 Fri | Recitation 12 | |
The course includes two midterm exams, held in class from 12:30 to 1:50 pm on Mar. 1 and Apr. 26. Plan any travel around the exams, as they cannot be rescheduled.
A mini-project will be due during the final exam period. This will be an opportunity to work with a team and apply machine learning concepts from class to a project that is more customized to your interests. More details about the mini-project and its deadlines will be announced later in the semester.
There will be approximately six homework assignments with written and programming components, and approximately five online assignments (subject to change). Written and online components will involve working through algorithms presented in class, deriving and proving mathematical results, and critically analyzing material presented in class. Programming assignments will involve writing code in Python to implement various algorithms.
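To give a sense of the from-scratch style these programming assignments take, here is a small illustrative sketch (a hypothetical example, not an actual assignment) of a k-nearest-neighbor classifier, one of the course topics, written with plain NumPy:

```python
# Hypothetical example of the style of a programming assignment:
# a from-scratch k-nearest-neighbor classifier using only NumPy.
import numpy as np

def knn_predict(X_train, y_train, X_test, k=1):
    """Predict a label for each row of X_test by majority vote
    among the k nearest training points (Euclidean distance)."""
    preds = []
    for x in X_test:
        # Squared Euclidean distance to every training point
        dists = np.sum((X_train - x) ** 2, axis=1)
        # Indices of the k closest training points
        nearest = np.argsort(dists)[:k]
        # Majority vote over their labels
        preds.append(np.bincount(y_train[nearest]).argmax())
    return np.array(preds)

X_train = np.array([[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[0.05, 0.0], [1.0, 0.9]])
print(knn_predict(X_train, y_train, X_test, k=3))  # → [0 1]
```

Assignments typically also ask you to vectorize such loops and evaluate the model on held-out data, as covered in the model selection lectures.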
For any assignments that aren't released yet, the dates below are tentative and subject to change.
Assignment | Link (if released) | Due Date |
---|---|---|
HW 1 (programming) | hw1.ipynb | 1/29 Sun, 11:59 pm |
HW 2 (online) | Gradescope | 2/4 Sat, 11:59 pm |
HW 3 (written/programming) | hw3_blank.pdf, hw3_tex.zip, hw3.ipynb | 2/9 Thu, 11:59 pm |
HW 4 (online) | Gradescope | 2/16 Thu, 11:59 pm |
HW 5 (written/programming) | hw5_blank.pdf, hw5_tex.zip, hw5.ipynb | 2/23 Thu, 11:59 pm |
HW 6 (online) | Gradescope | 3/24 Fri, 11:59 pm |
HW 7 (written/programming) | hw7_blank.pdf, hw7_tex.zip, hw7.ipynb | 4/1 Sat, 11:59 pm |
HW 8 (online) | Gradescope | 4/9 Sun, 11:59 pm |
HW 9 (online) | Gradescope | 4/17 Mon, 11:59 pm |
HW 10 (written/programming) | hw10_blank.pdf, hw10_tex.zip, hw10.ipynb | 4/22 Sat, 11:59 pm |
Grades will ultimately be collected and reported in Canvas.
Final scores will be composed of:
This class is not curved. However, we convert final course scores to letter grades based on grade boundaries determined at the end of the semester. What follows is a rough guide to how course grades will be established, not a precise formula; we will fine-tune cutoffs and other details as we see fit after the end of the course. This is meant to help you set expectations and take action if your trajectory in the class does not lead to the grade you are hoping for. So, here is a rough heuristic for the correlation between final grades and total scores:
This heuristic assumes that the makeup of a student's grade is not wildly anomalous: exceptionally low overall scores on exams, programming assignments, or written assignments will be treated on a case-by-case basis.
Precise grade cutoffs will not be discussed at any point during or after the semester. For students very close to grade boundaries, instructors may, at their discretion, consider participation in lecture and recitation, exam performance, and overall grade trends when assigning the final grade.
Pre-reading checkpoints have no extensions or late days. However, the lowest two checkpoint scores will be dropped when computing your semester score. Reasoning: we want everyone to complete the pre-reading prior to lecture so that we can build on that knowledge in class, but minor illness and other small disruptive events outside of your control happen occasionally, hence the two dropped scores. See below for information on rare exceptions.
You have a pool of 6 slip days across all written/programming and online assignment types.
Aside from slip days, dropping the lowest checkpoints, and the 80% threshold for participation, there will be no extensions on assignments in general. If you think you really really need an extension on a particular assignment, e-mail Joshmin, joshminr@andrew.cmu.edu, as soon as possible and before the deadline. Please be aware that extensions are entirely discretionary and will be granted only in exceptional circumstances outside of your control (e.g., due to severe illness or major personal/family emergencies, but not for competitions, club-related events, or interviews). The instructors will require confirmation from University Health Services or your academic advisor, as appropriate.
We certainly understand that unfortunate things happen in life. However, not all unfortunate circumstances are valid reasons for an extension. Nearly all situations that make you run late on an assignment can be avoided with proper planning, often just by starting early. Here are some examples:
You are encouraged to read books and other instructional materials, both online and offline, to help you understand the concepts and algorithms taught in class. These materials may contain example code or pseudo code, which may help you better understand an algorithm or an implementation detail. However, when you implement your own solution to an assignment, you must put all materials aside, and write your code completely on your own, starting “from scratch”. Specifically, you may not use any code you found or came across. If you find or come across code that implements any part of your assignment, you must disclose this fact in your collaboration statement.
Students are responsible for proactively protecting their work from copying and misuse by other students. If a student’s work is copied by another student, the original author is also considered to be at fault and in gross violation of the course policies. It does not matter whether the author allowed the work to be copied or was merely negligent in preventing it from being copied. When overlapping work is submitted by different students, both students will be punished.
Do not post your solutions publicly, neither during the course nor afterwards.
Violations of these policies will be reported as an academic integrity violation and will also result in a -100% score on the associated assignment/exam. Information about academic integrity at CMU may be found at https://www.cmu.edu/academic-integrity. Please contact the instructor if you ever have any questions regarding academic integrity or these collaboration policies.
(The above policies are adapted from 10-601 Fall 2018 and 10-301/601 Spring 2023 course policies.)
If you have a disability and have an accommodations letter from the Disability Resources office, we encourage you to discuss your accommodations and needs with us as early in the semester as possible. We will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, we encourage you to visit their website.
Take care of yourself. Do your best to maintain a healthy lifestyle this semester by eating well, exercising, getting enough sleep, and taking some time to relax. This will help you achieve your goals and cope with stress.
All of us benefit from support during times of struggle. There are many helpful resources available on campus and an important part of the college experience is learning how to ask for help. Asking for support sooner rather than later is almost always helpful.
If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 412-268-2922 and visit their website at http://www.cmu.edu/counseling/. Consider reaching out to a friend, faculty or family member you trust for help getting connected to the support that can help.
If you have questions about this or your coursework, please let us know. Thank you, and have a great semester.