Introduction to Machine Learning
Machine Learning is concerned with computer programs that automatically improve their performance through experience (e.g., programs that learn to recognize human faces, recommend music and movies, and drive autonomous robots). This course covers the theory and practical algorithms for machine learning from a variety of perspectives. We cover topics such as Bayesian networks, decision tree learning, support vector machines, statistical learning methods, unsupervised learning and reinforcement learning. The course covers theoretical concepts such as inductive bias, the PAC learning framework, Bayesian learning methods, and margin-based learning. Programming assignments include hands-on experiments with various learning algorithms. This course is designed to give a graduate-level student a thorough grounding in the methodologies, technologies, mathematics and algorithms currently needed by people who do research in machine learning.
After completing the course, students should be able to:
10-301 and 10-601 are identical. Undergraduates must register for 10-301 and graduate students must register for 10-601. This course covers topics similar to those in other introductory machine learning and A.I. courses, such as 10-315, 10-701, and 15-281. Contact the instructor if you are concerned about which course is appropriate for you.
Students entering the class are expected to have a pre-existing working knowledge of probability, linear algebra, statistics and algorithms, though the class has been designed to allow students with a strong quantitative background to catch up and fully participate. In addition, recitation sessions will be held to review some basic concepts.
You must strictly adhere to these prerequisites! Even if CMU’s registration system does not prevent you from registering for this course, it is still your responsibility to make sure you have all of these prerequisites before you register.
Notably missing from this prerequisite list is any linear algebra course. Linear algebra is indeed a central piece of this machine learning course. Given the lack of a linear algebra prerequisite, we will provide the necessary resources and instruction for linear algebra. That being said, if you have never been exposed to matrices and vectors in any context, please contact the instructor to discuss how to best meet your linear algebra needs.
Please see the instructor if you are unsure whether your background is suitable for the course.
See office hours on the calendar below.
When appropriate, this course uses the CMU OHQueue tool as a queueing system for office hours.
Feel free to contact the course staff via a private post on Piazza to request office hours by appointment. We'll do our best to accommodate these requests. We also occasionally open 15-minute appointment slots on the course calendar. These appointment slots may be secured via the following link. Please be courteous to other students when selecting these slots, e.g., don't select more than one slot per day. Link: OH Appointment Slots
Daumé III, Hal. A Course in Machine Learning, available online, (optional)
Bishop, Christopher. Pattern Recognition and Machine Learning, available online, (optional)
Murphy, Kevin P. Machine Learning: A Probabilistic Perspective, available online, (optional)
Goodfellow, Ian, Yoshua Bengio, Aaron Courville. Deep Learning, available online, (optional)
Mitchell, Tom. Machine Learning, select chapters available online, (optional)
Dates | Topic | Reading / Demo | Slides / Notes |
---|---|---|---|
8/31 Mon | 1: Introduction to ML | notation.pdf | pptx (inked) pdf (inked) |
9/2 Wed | 2: Decision Trees | Daumé 1 | pptx (inked) pdf (inked) |
9/7 Mon | No class: Labor Day | ||
9/9 Wed | 3: Decision Trees | Daumé 2, Entropy, Cross-Entropy video, A. Géron | pptx (inked) pdf (inked) |
9/14 Mon | 4: Decision Trees and Nearest Neighbor | Daumé 3 | pptx (inked) pdf (inked) |
9/16 Wed | 5: Nearest Neighbor and Model Selection | pptx (inked) pdf (inked) | |
9/21 Mon | 6: Linear Regression | Murphy 7.1-7.3 | pptx (inked) pdf (inked) |
9/23 Wed | 7: Optimization | pptx (inked) pdf (inked) | |
9/28 Mon | 8: Logistic Regression | Murphy 1.4.6, 8.1-8.3, lec8.ipynb | pptx (inked) pdf (inked) |
9/30 Wed | 9: Logistic Regression | pptx (inked) pdf (inked) | |
10/2 Fri | 10: Logistic Regression and Feature Engineering | pptx (inked) pdf (inked) | |
10/5 Mon | MIDTERM EXAM 1, in-class | ||
10/7 Wed | 11: Feature Engineering and Regularization | Goodfellow, et al, Ch. 7.1, 7.8 | pptx (inked) pdf (inked) |
10/12 Mon | 12: Regularization and Neural Networks | lec12.ipynb | pptx (inked) pdf (inked) |
10/14 Wed | 13: Neural Networks | Goodfellow, et al, Ch. 6 | pptx (inked) pdf (inked) |
10/19 Mon | 14: Neural Networks | Goodfellow, et al, Ch. 9 | pptx (inked) pdf (inked) |
10/21 Wed | 15: Learning Theory | A Few Useful Things to Know about Machine Learning. Pedro Domingos (2012). | pptx (inked) pdf (inked) |
10/26 Mon | 16: Learning Theory | Generalization Abilities: Sample Complexity Results. Nina Balcan (2015). Lecture notes. | PAC Learning: Theorem 1, pptx (inked) pdf (inked) |
10/28 Wed | 17: MLE/MAP | Estimating Probabilities: MLE and MAP. Tom Mitchell (2018, draft) | pptx (inked) pdf (inked) |
11/2 Mon | 18: Generative Models and Naive Bayes | Generative and Discriminative Classifiers. Tom Mitchell (2020, draft). | pptx (inked) pdf (inked) SPAM handout (sol) |
11/4 Wed | 19: Bayes Nets | 15-281 Probability Reference Sheet | pptx (inked) pdf (inked) |
11/6 Fri | 20: HMMs | A Tutorial on HMMs. Rabiner (1989). [Only pages 257-266] You may need to authenticate here first: CMU Libraries IEEE | pptx (inked) pdf (inked) |
11/9 Mon | MIDTERM EXAM 2, in-class | ||
11/11 Wed | 21: HMMs | pptx (inked) pdf (inked) | |
11/16 Mon | 22: MDPs | Reinforcement Learning: A Survey. Kaelbling, et al (1996). | pptx (inked) pdf (inked) |
11/18 Wed | 23: MDPs | pptx (inked) pdf (inked) | |
11/20 Fri | 24: Reinforcement Learning | [Additional] Playing Atari with Deep Reinforcement Learning. Mnih, et al (2013). | pptx (inked) pdf (inked) |
11/23 Mon | Recitation | ||
11/25 Wed | No class: Thanksgiving | ||
11/30 Mon | 25: Support Vector Machines | Bishop 7.1, Daumé 11 | pptx (inked) pdf (inked) |
12/2 Wed | 26: Support Vector Machines | pptx (inked) pdf (inked) | |
12/7 Mon | Clustering | Bishop 9.1 | pptx (inked) pdf (inked) |
12/9 Wed | Dimensionality Reduction | A Tutorial on Principal Component Analysis. Jonathon Shlens (2014). | pptx (inked) pdf (inked) |
12/14 Mon | FINAL EXAM | 8:30-11:30 am or 1:00-4:00 pm | |
There will be three recitation sections on Fridays: 8:00 am, 11:40 am, and 3:20 pm. You may attend any one
of these sections.
Recitation will take place live using Zoom. See Canvas for Zoom links. Zoom sessions for recitation will
not be recorded.
Recitation attendance is recommended to help solidify weekly course topics. That said, the recitation materials published below are required content and are in scope for the midterm and final exams.
Dates | Recitation | Handout | Code/Demo |
---|---|---|---|
9/4 Fri | Recitation 1 | recitation1.pdf (solutions) | |
9/11 Fri | Recitation 2 | recitation2.pdf (solutions) | Jupyter Notebook, recitation2.py |
9/18 Fri | Recitation 3 | Numpy Notebook, Logging Notebook, Workflow and Debugging | |
9/25 Fri | Recitation 4 | recitation4.pdf (solutions) | |
10/2 Fri | Lecture | ||
10/9 Fri | Recitation 5 | recitation5.pdf (solutions) | |
10/16 Fri | Recitation 6 | recitation6.pdf (solutions) | |
10/30 Fri | Recitation 7 | recitation7.pdf (solutions) | |
11/13 Fri | Recitation 8 | recitation8.pdf (solutions) | |
11/23 Mon | Recitation 9 | recitation9.pdf (solutions) | |
12/4 Fri | Recitation 10 | recitation10.pdf (solutions) | |
The course includes two midterm exams and a final exam. The midterms will both take place during your lecture timeslot on Monday, Oct. 5 and Monday, Nov. 9. The final exam is on Monday, Dec. 14, 8:30-11:30 am or 1:00-4:00 pm. Plan any travel around exams, as exams cannot be rescheduled.
There will be approximately nine homework assignments that will have some combination of written and programming components. Written components will involve working through algorithms presented in the class, deriving and proving mathematical results, and critically analyzing material presented in class. Programming components will involve writing code in Python, C++, or Java to implement various algorithms.
For any assignments that aren't released yet, the dates below are tentative and subject to change.
Assignment | Link (if released) | Due Date |
---|---|---|
HW 1 (written, programming) | Gradescope and Piazza Resources | 9/10 Thu, 11:59 pm |
HW 2 (written, programming) | Gradescope and Piazza Resources | 9/21 Mon, 11:59 pm |
HW 3 (written) | Gradescope | 9/28 Mon, 11:59 pm |
HW 4 (written, programming) | Gradescope and Piazza Resources | 10/14 Wed, 11:59 pm |
HW 5 (written, programming) | Gradescope and Piazza Resources | 10/26 Mon, 11:59 pm |
HW 6 (written) | Gradescope | 11/2 Mon, 11:59 pm |
HW 7 (written, programming) | Gradescope and Piazza Resources | 11/19 Thu, 11:59 pm |
HW 8 (written, programming) | Gradescope and Piazza Resources | 12/3 Thu, 11:59 pm |
HW 9 (written) | Gradescope | 12/9 Wed, 11:59 pm |
Grades will be collected and reported in Canvas. Please let us know if you believe there to be an error in the grade reported in Canvas.
Final scores will be composed of:
Participation will be based on the percentage of in-class polling questions answered:
Correctness of in-class polling responses will not be taken into account for participation grades.
It is against the course academic integrity policy to answer in-class polls when you are not present in lecture. Violations of this policy will be reported as an academic integrity violation. Information about academic integrity at CMU may be found at https://www.cmu.edu/academic-integrity.
There will be a few other means to collect participation points; stay tuned to Piazza for more details.
This class is not curved. However, we convert final course scores to letter grades based on grade boundaries that are determined at the end of the semester. What follows is a rough guide to how course grades will be established, not a precise formula; we will fine-tune cutoffs and other details as we see fit after the end of the course. It is meant to help you set expectations and take action if your trajectory in the class does not lead to the grade you are hoping for. With that in mind, here is a rough heuristic for the correlation between final grades and total scores:
Grades for graduate students will be broken down further with +/- distinctions. See CMU grading policies for more information.
The above heuristic assumes that the makeup of a student's grade is not wildly anomalous: exceptionally low overall scores on exams, programming assignments, or written assignments will be treated on a case-by-case basis.
Precise grade cutoffs will not be discussed at any point during or after the semester. For students very close to grade boundaries, instructors may, at their discretion, consider participation in lecture and recitation, exam performance, and overall grade trends when assigning the final grade.
Homework assignments:
Aside from this, there will be no extensions on assignments in general. If you think you
really need an extension on a particular assignment, contact the instructor as soon as possible
and before the deadline. Please be aware that extensions are entirely discretionary and will be granted only
in exceptional circumstances outside of your control (e.g., severe illness or a major personal/family
emergency, but not competitions, club-related events, or interviews). The instructors will require
confirmation from University Health Services or your academic advisor, as appropriate.
Nearly all situations that make you run late on an assignment can be avoided with proper planning,
often just by starting early. Here are some examples:
We encourage you to discuss course content and assignments with your classmates. However, these discussions must be kept at a conceptual level only.
Violations of these policies will be reported as an academic integrity violation. Information about academic integrity at CMU may be found at https://www.cmu.edu/academic-integrity. Please contact the instructor if you ever have any questions regarding academic integrity or these collaboration policies.
If you have a disability and have an accommodations letter from the Disability Resources office, we encourage you to discuss your accommodations and needs with us as early in the semester as possible. We will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, we encourage you to visit their website.
Take care of yourself. Do your best to maintain a healthy lifestyle this semester by eating well, exercising,
getting enough sleep, and taking some time to relax. This will help you achieve your goals and cope with
stress.
All of us benefit from support during times of struggle. There are many helpful resources available on campus
and an important part of the college experience is learning how to ask for help. Asking for support sooner
rather than later is almost always helpful.
If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or
depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to
help: call 412-268-2922 and visit their website at http://www.cmu.edu/counseling/. Consider reaching out to a friend,
faculty or family member you trust for help getting connected to the support that can help.
If you have questions about this or your coursework, please let us know. Thank you, and have a great semester.
For this class, we are conducting research on teaching and learning. This research will involve some student work. You will not be asked to do anything above and beyond the normal learning activities and assignments that are part of this course. You are free not to participate in this research, and your participation will have no influence on your grade for this course or your academic career at CMU. If you do not wish to participate, please send an email to Chad Hershock (hershock@andrew.cmu.edu). Participants will not receive any compensation.

The data collected as part of this research will include student grades. All analyses of data from participants’ coursework will be conducted after the course is over and final grades are submitted. The Eberly Center may provide support on this research project regarding data analysis and interpretation. The Eberly Center for Teaching Excellence and Educational Innovation is located on the CMU-Pittsburgh Campus, and its mission is to support the professional development of all CMU instructors regarding teaching and learning. To minimize the risk of breach of confidentiality, the Eberly Center will never have access to data from this course containing your personal identifiers. All data will be analyzed in de-identified form and presented in the aggregate, without any personal identifiers.

If you have questions pertaining to your rights as a research participant, or to report concerns about this study, please contact Chad Hershock (hershock@andrew.cmu.edu).