Course Overview

In this course students will gain exposure to practical aspects of machine learning and statistical data analysis. Students are expected to complete a semester-long project entailing an end-to-end application of machine learning. Through project assignments, lectures, discussions, and readings, students will learn about the intricacies involved in the practical application of ML, and will experience building machine learning systems for real-world problems and data. Students will develop skills in problem formulation, working with messy data, making ML design choices, developing reproducible pipelines, evaluating the impact of deployed models, and understanding and addressing concerns beyond accuracy (e.g., privacy, bias, efficiency).

Given recent advances in ML tooling, infrastructure, and modeling (e.g., pretrained image and language models), this semester will focus specifically on understanding the degree to which these advancements impact real-world applications of ML. Through careful consideration of end-to-end applications of ML, we will explore how recent models, tools, and infrastructure can potentially improve outcomes, where they fall short, and what new challenges and risks these technologies present.

Prerequisites

This course is intended for MLD PhD and MS students. There are no formal prerequisites for this course beyond a general knowledge of machine learning principles and an interest in how to apply them to real problems.

Textbooks

There are no required textbooks for this course; all required readings will be in the form of papers, provided below.




Course Components

Format

This course covers the use of machine learning in practice through an independent, hands-on project, as well as through reading and discussion of materials including research papers, articles, and blog posts. Students will participate in discussion and presentation of the assigned reading material and will complete a semester-long project.

Paper discussions will loosely follow the role-playing seminar format; for each paper, you will either be assigned a presenter role or complete the non-presenter assignment (described below).

Presenter assignment: You will periodically present a paper in class, taking on one of the following presenter roles.

  1. Educator: Explain the key ideas in the paper to the class. Keep your presentation to 5-10 minutes maximum.
  2. Investigator: Investigate one of the paper’s authors. What is their area of expertise? Where have they worked previously? What prior projects might have led to working on this one? Explore what motivated them to write the paper and what biases they may have.
  3. Reviewer: Discuss both one strength and one weakness of the paper. What did you like about the paper? Do you agree with the paper's stance/findings? Did the paper overlook anything? Is there something that could have strengthened the work?
  4. Discussion Leader: Prepare three discussion questions to ask the class, and lead the discussion amongst the students when answering these questions.

Non-presenter assignment: If you are not presenting the paper, you must still read the paper and prepare the following item:


Project: All students in the class will additionally complete a semester-long project that explores an application of machine learning on real-world data. Students will complete the project individually (not in groups). There will be several assignments, small group discussions, and presentations related to the project throughout the semester (see details below).



Grading

The requirements of this course consist of participating in and leading discussion sessions as well as completing a course project. The grading breakdown is as follows:



Piazza

We will use Piazza for class discussions. Go to this Piazza website to join the course forum (note: you must use a cmu.edu email account to join). We strongly encourage students to post on this forum rather than emailing the course staff directly; this will be more efficient for both students and staff. Students should use Piazza to:

The course Academic Integrity Policy must be followed on the message boards at all times. Please be polite.




Schedule (Subject to Change)

Date Class Type Topic Required Reading Assignments Additional Resources
Jan 16 Lecture

Warning: The structure of this class may be different from other courses you've taken! What to expect and what I expect from you.

Jan 18 Lecture

A deep dive into the course project and considerations for selecting a problem/dataset, including whether or not ML is the best approach.

Project Scoping Guide
Part I: ML Pipelines in Practice
Jan 23 Discussion

The field of ML has rapidly transformed in recent years. How did we get here? How might recent advancements improve real-world applications of ML? Alternatively, what new challenges do these technologies present, and where might they fall short?

Data Science at the Singularity
Artificial Intelligence—The Revolution Hasn’t Happened Yet
Winner's Curse?
Video
Jan 25 Discussion
How should we formulate a machine learning problem, particularly in light of the data at hand? What are the goals of the model and how can we measure success? We will explore possible pitfalls to consider as well as strategies to prevent them.

Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations
Framing an ML Problem
Video
Jan 30 Discussion
Why is it important to explore your data before applying ML? What tools/techniques exist to effectively explore, prepare, and clean your data? We will consider pros/cons of existing techniques for data preparation.

Pervasive Label Errors in Test Sets Destabilize ML Benchmarks
Can Foundation Models Wrangle Your Data?
Assignment 0 due
Tutorial on Visualization for ML
EDA Example
Feb 1 Discussion
Before exploring more complex modeling choices it's important to understand the performance of a common-sense baseline. How can you select a good baseline?

On the Difficulty of Evaluating Baselines
Always Start With a Stupid Model
Feb 6 Project Project Check-Ins
Feb 8 Discussion
In many real-world applications, expressing domain expertise through feature engineering can dramatically improve model performance. We will discuss techniques for data/feature engineering and to what extent deep learning "solves" the issue of featurization.

Data Augmentation as Feature Manipulation
Assignment 1 due
Augmentation Engineers
Missing Data Imputation
Techniques for Feature Engineering
Feb 13 No Class No Class (Cancelled)
Feb 15 Speaker Ananya Joshi (CMU), ML for Time Series Data
Feb 20 Discussion
How do you decide which models are better than others? We will discuss validation strategies, as well as considerations beyond accuracy in model selection (e.g., efficiency, fairness, explainability).

The Secrets of Machine Learning
arg min blog
Feb 22 Discussion
What are methods and best practices for performing ML hyperparameter tuning? How might your approach change when you have a limited budget for tuning?
Optimizer Benchmarking Needs to Account for Hyperparameter Tuning
Feb 27 Project Project Pitches Assignment 2 due
Feb 29 Project Project Pitches
Mar 5 No Class No Class (Spring Break)
Mar 7 No Class No Class (Spring Break)
Mar 12 Discussion
How can we combine models to improve performance? We will cover classical techniques in ensembling as well as more recent approaches building on these ideas.
Model Soups
Part II: Beyond Accuracy on the Test Set: Considerations for ML Deployment
Mar 14 Discussion
What can we do when the data distribution at deployment differs from training? We will consider potential types of shift, as well as practical approaches for detection and mitigation.
Reliable and Trustworthy Machine Learning for Health Using Dataset Shift Detection
The Effect of Natural Distribution Shift on Question Answering Models
In Search of Lost Domain Generalization
Mar 19 Discussion
We will explore methods for model compression, including practical considerations for development and deployment in resource-constrained settings. We will also consider how compression impacts concerns beyond accuracy, such as robustness and fairness.
Model Compression in Practice
Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy
Mar 21 Discussion
What does fairness mean in the context of ML? We will explore specific notions of fairness in connection to regulatory compliance, and will discuss more broadly how bias may arise in ML systems and how we may hope to mitigate it.
Attacking Discrimination with Smarter ML
How AI Bias Happens
Fair ML Book
Fair ML Tutorial
Can you make AI fairer than a judge?
Google Gemini
Mar 26 Discussion
What does it mean for an ML model to be interpretable and/or explainable? To what degree can we hope to achieve this for general models? We will discuss challenges and best practices around improving the interpretability, explainability, and usability of ML in real-world applications.
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
Interpretable Machine Learning: Moving from Mythos to Diagnostics
The Mythos of Model Interpretability
Mar 28 Discussion
As ML applications become more prevalent they are at an increased risk for being attacked or misused. We will consider how an adversary may attack an ML system; what may be done in practice to try to prevent this; and what attacks may look like in modern systems.
Attacking ML with Adversarial Examples
Jailbroken: How Does LLM Safety Training Fail?
Practical Black-Box Attacks against Machine Learning
Universal and Transferable Adversarial Attacks on Aligned Language Models
Apr 2 Project Project Updates Assignment 3 due
Apr 4 Discussion
Data-hungry ML models are increasingly trained using potentially sensitive user data. In what ways might an ML system leak sensitive information from the training data/users? What can we do in practice to mitigate this concern?
The Secret Sharer
Considerations for Differentially Private Learning with Large-Scale Public Pretraining
Lecture Notes on DP (Gautam Kamath)
Apr 9 Discussion Causality The Seven Tools of Causal Inference
Sparse Feature Circuits
MSR Webinar
Apr 11 No Class No class (Spring Carnival)
Apr 16 Discussion AI Regulation AI Bill of Rights
Gemma Paper
Dangers of Open-Source AI
Dangers of Closed-Source AI
AI Executive Order
Apr 18 Speaker
Large language models (LLMs) are increasingly being used in pipelines that repeatedly process or generate data of some sort. Despite their usefulness, LLM pipelines often produce errors, typically identified through manual “vibe checks” by developers. This talk explores automating this process using evaluation assistants, presenting a method for automatically generating assertions. We share insights from a deployment with LangChain, where we auto-generated assertions for a number of real-world LLM pipelines. Finally, we discuss insights from a separate qualitative study of how engineers use evaluation assistants: we highlight the subjective nature of "good" assertions and how they adapt over time with changes in prompts, data, LLMs, and pipeline components.
Apr 23 Speaker
In this talk, we will explore challenges faced by and possible solutions for practitioners evaluating machine learning models. First, we’ll motivate and discuss methods to perform disaggregated evaluations that go beyond aggregate performance metrics. Such disaggregated evaluations can help practitioners anticipate and address algorithmic bias and reliance on spurious correlations. To ground our discussion, we'll discuss several algorithmic and interactive tools to identify where a model underperforms for applications with unstructured data (such as medical imaging), which present unique challenges for model evaluation. We’ll conclude with an open discussion about how existing paradigms for evaluating predictive ML models may or may not apply in the age of generative AI.
What did my AI learn?
Apr 25 Project Project Presentations Assignment 4 due




General Policies

Late Policy

Students are expected to attend class in person; we will not have remote participation capabilities. If you miss a discussion class without completing the corresponding assignment, you will receive a zero for that session. If you miss a class in which you are assigned a presenter role, you must still create the presentation for that role before class and find someone else to present it for you. If you miss a class in which you have a non-presenter role, you must complete the non-presenter assignment and submit it before the start of class to receive credit. Late work cannot be accepted for the readings, since the whole class reads and discusses the same papers at the same time.

For the project assignments, all students have one grace day in total, which may be used on any one of the four project assignments. After this grace day has been used, late project assignments are eligible for at most 75% of the points during the first day (24-hour period) after the deadline, 50% during the second, and 25% during the third. The final project cannot be accepted after the final class slot, since it must be presented then.
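For concreteness, the deduction schedule above can be sketched as a small helper. This is purely illustrative; the function `late_credit` and its arguments are our own names, not part of any official grading tool.

```python
def late_credit(days_late: int, grace_day_used: bool) -> float:
    """Fraction of points still earnable on a late project assignment.

    Policy: one grace day total; after it is spent, credit drops to
    75% / 50% / 25% for days 1-3 past the deadline, then 0.
    """
    if days_late <= 0:
        return 1.0
    if not grace_day_used:
        days_late -= 1  # the single grace day absorbs one late day
        if days_late <= 0:
            return 1.0
    return {1: 0.75, 2: 0.50, 3: 0.25}.get(days_late, 0.0)
```

For example, an assignment turned in one day late with the grace day still available earns full credit, while the same submission after the grace day is spent earns at most 75%.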

Audit Policy

Official auditing of the course (i.e. taking the course for an “Audit” grade) is not permitted this semester. Unofficial auditing of the course (e.g., attending classes and participating in discussion) is welcome with instructor approval.

Pass/Fail Policy

Pass/Fail is allowed in this class; no permission is required from the course staff. The grade for the Pass cutoff will depend on your program. Be sure to check with your program / department as to whether you can count a Pass/Fail course towards your degree requirements.

Accommodations for Students with Disabilities

If you have a disability and have an accommodations letter from the Disability Resources office, please discuss your accommodations and needs with Professor Smith as early in the semester as possible. I will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, please contact them at: access@andrew.cmu.edu.


Academic Integrity Policies

Read this Carefully

Collaboration among Students

Duty to Protect One’s Work

Students are responsible for proactively protecting their work from copying and misuse by other students. If a student’s work is copied by another student, the original author is also considered to be at fault and in violation of the course policies. It does not matter whether the author allowed the work to be copied or was merely negligent in preventing it from being copied. When overlapping work is submitted by different students, both students will be punished.

To protect future students, do not post your solutions publicly, neither during the course nor afterwards.

Penalties for Violations of Course Policies

All violations (even the first one) of course policies will always be reported to the university authorities (your Department Head, Associate Dean, Dean of Student Affairs, etc.) as an official Academic Integrity Violation and will carry severe penalties.
  1. The penalty for the first violation is a reduction of one and a half letter grades. For example, if your final letter grade for the course were to be an A-, it would become a C+.
  2. The penalty for the second violation is failure in the course, and can even lead to dismissal from the university.