Introduction to Machine Learning

10-701, Spring 2021

Carnegie Mellon University

Geoff Gordon, Aarti Singh



Lecture:

Day and Time: Monday and Wednesday, 4:00 - 5:20 pm
Location: Remote (connect via Zoom link on Canvas)

Recitation:

Day and Time: Friday, 4:00-5:20 pm
Location: Remote (connect via Zoom link on Canvas)

Office Hours:
Day | Time | Location | Staff
Mondays | 10:00-11:00 am | Zoom link on Canvas | Arundhati Banerjee
Mondays | 5:30-6:30 pm | Zoom link on Canvas | Yuchen Shen
Tuesdays | 4:20-5:20 pm | Zoom link on Canvas | Jeffrey Huang
Wednesdays | 5:30-6:30 pm | Same link as lecture (Zoom link on Canvas) | Geoff Gordon
Thursdays | 9:00-10:00 am | Zoom link on Canvas | Aarti Singh
Thursdays | 3:30-4:30 pm | Zoom link on Canvas | Stefani Karp
Fridays | 10:30-11:30 pm | Zoom link on Canvas | Zhe Chen
Saturdays | 1:30-2:50 pm | In-person section only: details on SIO | Jeffrey Tsaw

Note: All events for this course are in Pittsburgh time (EST). While we understand that many of you may have conflicts, we ask that you make an effort to be available for important class events such as exams. All lectures and recitations will be recorded, and the lecture recordings will be available at the Zoom link on Canvas ONLY for the use of students in this course. To protect the FERPA rights of all students in the classroom, you are not allowed to share the recordings anywhere. Breakout rooms and office hours will NOT be recorded.


Course Description:

Machine Learning is concerned with computer programs that automatically improve their performance through experience (e.g., programs that learn to recognize human faces, recommend music and movies, and drive autonomous robots). This course covers the core concepts, theory, algorithms and applications of machine learning. We cover supervised learning topics such as classification (Naive Bayes, Logistic regression, Support Vector Machines, neural networks, k-NN, decision trees) and regression (linear, nonlinear, kernel, nonparametric), unsupervised learning (density estimation, MLE, MAP, mixture models, clustering, PCA, dimensionality reduction), as well as graphical models, HMMs and Reinforcement learning. The core concepts include generalization, overfitting, regularization, model selection, fairness and related issues. Theoretical tools covered include PAC bounds and Rademacher complexity. Programming assignments include hands-on experiments with various learning algorithms. This course is designed to give a graduate-level student a thorough grounding in the methodologies, technologies, mathematics and algorithms currently needed by people who do research in machine learning.

Prerequisites: Students entering the class are expected to have a pre-existing working knowledge of probability, linear algebra, calculus, statistics and algorithms, though the class has been designed to allow students with a strong numerate background to catch up and fully participate. In addition, recitation sessions will be held to review some basic concepts. Please also take the self-assessment exam to make sure you have the right background.

Learning Outcomes: After completing the course, students will be able to:
  • Implement and analyze existing learning algorithms, including well-studied methods for classification, regression, density estimation, clustering, dimensionality reduction, graphical models and reinforcement learning
  • Integrate multiple facets of practical machine learning in a single system: data preprocessing, learning, regularization and model selection
  • Describe the formal properties of models and algorithms for learning and explain the practical implications of those results
  • Compare and contrast different paradigms for learning (supervised, unsupervised, etc.)
  • Design experiments to evaluate and compare different machine learning techniques on real-world problems
Recommended Textbooks:
  • Pattern Recognition and Machine Learning, Christopher Bishop (available online)
  • Machine Learning: A probabilistic perspective, Kevin Murphy (available online)
  • Machine Learning, Tom Mitchell.
  • The Elements of Statistical Learning: Data Mining, Inference and Prediction, Trevor Hastie, Robert Tibshirani, Jerome Friedman.
Grading:
  • 5 Homeworks (50%)
  • 2 Depth exercises (20%)
  • Participation (3%)
  • Midterm and final exam (12+15=27%)
Grades will be collected on Canvas.
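
As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes that "12+15" means 12% for the midterm and 15% for the final, and that each component is expressed as a fraction between 0 and 1; the names WEIGHTS and course_score are hypothetical and not part of any course tooling.

```python
# Weights taken from the grading list above.
# Assumption: the midterm is worth 12% and the final exam 15%.
WEIGHTS = {
    "homeworks": 0.50,
    "depth_exercises": 0.20,
    "participation": 0.03,
    "midterm": 0.12,
    "final": 0.15,
}


def course_score(scores):
    """Combine per-component fractions (0.0 to 1.0) into a percentage."""
    return 100 * sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    example = {
        "homeworks": 0.90,
        "depth_exercises": 0.80,
        "participation": 1.00,
        "midterm": 0.70,
        "final": 0.75,
    }
    print(round(course_score(example), 2))  # 83.65
```

This sketch only shows the weighted sum; how letter grades are assigned from the total is not specified here.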

Homeworks: Homeworks will be released here, and students will turn them in via Gradescope.

Late Days:
  • There are a total of 8 late days across all homeworks, with no more than 2 late days per homework. (UPDATE: We have increased the total number of late days from 7 to 8. We added this additional late day to everyone’s bank to help out in case of side effects from getting the COVID vaccine. We hope that everyone is staying safe and that those who are eligible are signed up to be vaccinated! Note that this doesn’t change the max late days per assignment.)
  • Homeworks submitted after all late days are exhausted will be awarded 50% of the points if submitted within 24 hours after the late days are exhausted, and 0% after that.
  • Late days are for homeworks only and may not be transferred to depth exercises.
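
One possible reading of the late-day rules above, sketched in Python. This is an interpretation, not the official grading procedure. It assumes that partial days round up to whole late days, that the late days usable on one homework are the smaller of the per-homework cap and the remaining bank, and that a submission earning 50% or 0% still uses up the late days that were available. The function name late_credit and its parameters are hypothetical.

```python
import math

MAX_TOTAL_LATE_DAYS = 8  # total bank, from the policy above
MAX_PER_HOMEWORK = 2     # per-homework cap, from the policy above


def late_credit(hours_late, bank_remaining):
    """Return (credit multiplier, late days charged) for one homework.

    Assumptions (not stated in the policy): partial days round up, and
    submissions that earn 50% or 0% still consume the usable late days.
    """
    if hours_late <= 0:
        return 1.0, 0
    usable = min(MAX_PER_HOMEWORK, bank_remaining)
    days_late = math.ceil(hours_late / 24)
    if days_late <= usable:
        return 1.0, days_late          # covered by late days
    if hours_late <= (usable + 1) * 24:
        return 0.5, usable             # within 24 hours after late days run out
    return 0.0, usable                 # later than that


if __name__ == "__main__":
    bank = MAX_TOTAL_LATE_DAYS
    print(late_credit(30, bank))  # (1.0, 2)
    print(late_credit(60, bank))  # (0.5, 2)
    print(late_credit(80, bank))  # (0.0, 2)
    print(late_credit(10, 0))     # (0.5, 0)
```

If in doubt about an edge case, such as exactly how partial days are counted, ask the course staff rather than relying on this sketch.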
Communication: All class discussions (outside of lectures, recitations and office hours), announcements and other communication will take place via Piazza.

Policies:
Extensions
In general, we do not grant extensions on assignments. There are several exceptions:
  • Medical Emergencies: If you are sick and unable to complete an assignment or attend class, please go to University Health Services. For minor illnesses, we expect late days or our late penalties to provide sufficient accommodation. For medical emergencies (e.g. prolonged hospitalization), students may request an extension afterwards and should include a note from University Health Services.
  • Family/Personal Emergencies: If you have a family emergency (e.g. death in the family) or a personal emergency (e.g. mental health crisis), please contact your academic adviser or Counseling and Psychological Services (CaPS). In addition to offering support, they will reach out to the instructors for all your courses on your behalf to request an extension.
  • University-Approved Absences: If you are attending an out-of-town university approved event (e.g. multi-day athletic/academic trip organized by the university), you may request an extension for the duration of the trip. You must provide confirmation of your attendance, usually from a faculty or staff organizer of the event.
For any of the above situations, you may request an extension by emailing Brynn Edmunds (bedmunds@andrew.cmu.edu). The email should be sent as soon as you are aware of the conflict and at least 5 days prior to the deadline. In the case of an emergency, no notice is needed.

Collaboration
  • The purpose of student collaboration is to facilitate learning, not to circumvent it. Studying the material in groups is strongly encouraged. You may also seek help from other students in understanding the material needed to solve a particular homework problem, provided no written notes (including code) are shared or taken at that time, and provided learning is facilitated, not circumvented. Each student must work out the actual solution alone.
  • The presence or absence of any form of help or collaboration, whether given or received, must be explicitly stated and disclosed in full by all involved. Specifically, each assignment solution must include answers to the following questions:
    • Did you receive any help whatsoever from anyone in solving this assignment? Yes / No.
      • If you answered 'yes', give full details: ____________
      • (e.g. "Jane Doe explained to me what is asked in Question 3.4")
    • Did you give any help whatsoever to anyone in solving this assignment? Yes / No.
      • If you answered 'yes', give full details: ____________
      • (e.g. "I pointed Joe Smith to section 2.3 since he didn’t know how to proceed with Question 2")
    • Did you find or come across code that implements any part of this assignment? Yes / No. (See the policy on "found code" below.)
      • If you answered 'yes', give full details: ____________
      • (book & page, URL & location within the page, etc.).
  • If you gave help after turning in your own assignment and/or after answering the questions above, you must update your answers before the assignment's deadline, if necessary by emailing the course staff.
Previously Used Assignments
Some of the homework assignments used in this class may have been used in prior versions of this class, or in classes at other institutions, or elsewhere. Solutions to them may be, or may have been, available online, or from other people or sources. It is explicitly forbidden to use any such sources, or to consult people who have solved these problems before. It is explicitly forbidden to search for these problems or their solutions on the internet. You must solve the homework assignments completely on your own. We will be actively monitoring your compliance. Collaboration with other students who are currently taking the class is allowed, but only under the conditions stated above.

Policy Regarding "Found Code"
You are encouraged to read books and other instructional materials, both online and offline, to help you understand the concepts and algorithms taught in class. These materials may contain example code or pseudo code, which may help you better understand an algorithm or an implementation detail. However, when you implement your own solution to an assignment, you must put all materials aside, and write your code completely on your own, starting "from scratch". Specifically, you may not use any code you found or came across. If you find or come across code that implements any part of your assignment, you must disclose this fact in your collaboration statement.

Duty to Protect One's Work
Students are responsible for proactively protecting their work from copying and misuse by other students. If a student's work is copied by another student, the original author is also considered to be at fault and in gross violation of the course policies. It does not matter whether the author allowed the work to be copied or was merely negligent in preventing it from being copied. When overlapping work is submitted by different students, both students will be punished. To protect future students, do not post your solutions publicly, either during the course or afterwards.

Penalties for Violations of Course Policies
All violations (even the first one) of course policies will always be reported to the university authorities (your Department Head, Associate Dean, Dean of Student Affairs, etc.) as an official Academic Integrity Violation and will carry severe penalties.

  • The penalty for the first violation is a one-and-a-half letter grade reduction. For example, if your final letter grade for the course was to be an A-, it would become a C+.
  • The penalty for the second violation is failure in the course, and can even lead to dismissal from the university.
Academic Integrity
Any violations of academic integrity will always be reported to the university authorities (your Department Head, Associate Dean, Dean of Student Affairs, etc.) as an official Academic Integrity Violation, in compliance with CMU's Policy on Academic Integrity, and will carry severe penalties.

Audits and Pass/Fail
Audits are allowed (with some minimal requirements). Pass/Fail is allowed.

Wellness: Take care of yourself. Do your best to maintain a healthy lifestyle this semester by eating well, exercising, avoiding drugs and alcohol, getting enough sleep and taking some time to relax. This will help you achieve your goals and cope with stress.

All of us benefit from support during times of struggle. You are not alone. There are many helpful resources available on campus and an important part of the college experience is learning how to ask for help. Asking for support sooner rather than later is often helpful. If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 412-268-2922 and visit their website at http://www.cmu.edu/counseling/. Consider reaching out to a friend, faculty or family member you trust for help getting connected to the support that can help.

If you or someone you know is feeling suicidal or in danger of self-harm, call someone immediately, day or night:
CaPS: 412-268-2922
Re:solve Crisis Network: 888-796-8226
If the situation is life threatening, call the police:
On campus: CMU Police: 412-268-2323
Off campus: 911.

If you have questions about this or your coursework, please let the instructors know.
Disability support: If you have a disability and have an accommodations letter from the Disability Resources office, we encourage you to discuss your accommodations and needs with us as early in the semester as possible. We will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, we encourage you to contact them at access@andrew.cmu.edu.
Diversity: Diversity is fundamental to building and maintaining an equitable and inclusive campus community. Each of us is responsible for creating a safer, more inclusive environment. Unfortunately, incidents of bias or discrimination do occur, whether intentional or unintentional. We encourage anyone who experiences or observes unfair or hostile treatment on the basis of identity to speak out for justice and support by either (1) contacting Center for Student Diversity and Inclusion: csdi@andrew.cmu.edu, (412) 268-2150, or (2) reporting it anonymously at reportit.net using username: tartans, password: plaid