Course Overview

Machine learning pipelines are increasingly powered by data from multiple sources and stakeholders. Techniques for collaborative learning, such as federated learning, stand to power a new generation of machine learning applications by enabling coordinated, trustworthy learning between multiple parties and across diverse data sources. To do so, novel approaches must be developed that improve the accuracy and efficiency of learning across siloed data; mitigate risk and protect data privacy and ownership; and incorporate social and economic principles that incentivize data sharing and provide trustworthy cooperative learning schemes. This seminar course will cover various aspects of federated and collaborative learning, with a focus on recent research developments and discussion on future directions.

Prerequisites

Students are required to have taken an introductory machine learning course (e.g., 10-301, 10-315, 10-601, 10-701, 10-715 or equivalent).

Textbooks

There are no required textbooks for this course; all required readings will be in the form of papers, provided below.




Course Components

Format

This course will cover topics in the area of federated and collaborative learning through reading and discussion of recent research papers. Students will participate in discussion and presentation of papers and will complete a research project in the area of collaborative learning.

Paper discussions will use the role-playing seminar format inspired by Alec Jacobson and Colin Raffel.

Presenter assignment: You will present a paper every other class, taking on one of the following presenter roles.

  1. Reviewer: Complete a full review of the paper as if it were submitted to a conference. Follow the guidelines for NeurIPS reviewers to produce your review. In particular, please answer Questions 1 to 10 under "Review Form", including assigning an overall score.
  2. Archaeologist: Determine where this paper sits in the context of previous and subsequent work. Find and briefly report on both: (1) a prior paper that substantially influenced the current paper, and (2) a more recent paper that cites this current paper.
  3. Researcher: You’re a researcher working on a new project in this area. Propose an imaginary follow-up project that builds on the current paper. Pretend that this new project has been successful, and write up a brief introduction for a paper about your project using the five-point structure provided here (under "The Introduction").
  4. Investigator: You are a detective who needs to run a background check on one of the paper’s authors. Where have they worked? What did they study? What previous projects might have led to working on this one? What motivated them to work on this project?
  5. Practitioner: You work at a company or organization developing an application or product of your choice (one that has not already been suggested in a prior session). Describe the application/product in detail, and bring a convincing pitch for why you should be paid to implement the method in the paper for this particular application.
  6. Social Impact Assessor: Identify how this paper self-assesses its positive or negative impact on the world. Have any additional positive social impacts been left out? What are possible negative social impacts that were overlooked or omitted? Please read this short paper to see examples.
  7. Hacker (Optional): One student may optionally choose to implement a key method from the paper on a small dataset or toy problem. Prepare to share your code with the class and present the results of your experiments, highlighting whether or not you were able to replicate the results in the paper. You are welcome to use (and give credit to) an existing implementation for "backbone" code, but do not simply download and run an existing implementation; you should implement the core method from the paper yourself. As this role is more involved, a student who selects it can opt to sit out of a future presentation and use the grade received for this role in lieu of participation that day.

Non-presenter assignment: If you are not presenting the paper, you must still read the paper and prepare the following two items:

  1. Develop an alternative title for the paper or create a picture illustrating a key concept from the paper.
  2. Provide at least one question about the paper (e.g., something you're confused about or would like to hear discussed).

Project: All students in the class will additionally complete a research project in the area of collaborative learning and write a short 4-page paper. Students are welcome to work in groups of 2-3, and must include a "contributions" paragraph in their paper that concretely lists each author's contributions. All groups will be required to submit a one-page project proposal by the start of class on October 26th. The project proposal is a short, informal description of what you intend to do (experiments, datasets, methods, etc.). Groups will present their final projects during the last class of the semester. All students in each group are required to present some material during the final presentation.



Grading

The requirements of this course consist of participating in and leading discussion sessions and completing a course project. The grading breakdown is as follows:



Piazza

We will use Piazza for class discussions. Go to this Piazza website to join the course forum (note: you must use a cmu.edu email account to join). We strongly encourage students to post on this forum rather than emailing the course staff directly; this will be more efficient for both students and staff. Students should use Piazza to:

The course Academic Integrity Policy must be followed on the message boards at all times. Please be polite.




Schedule (Subject to Change)

Date | Content | Resources
Aug 29 | Introduction, logistics, and background: Part I (video) | FL Surveys (1,2)
    Blog post
Aug 31 | Introduction, logistics, and background: Part II (video) | NeurIPS tutorial
Sep 5 | Communication-Efficient Learning of Deep Networks from Decentralized Data
Sep 7 | Towards Federated Learning at Scale: System Design | MLSys video
Sep 12 | Federated Optimization in Heterogeneous Networks | A Field Guide to Federated Optimization
    FLOW seminar
Sep 14 | SCAFFOLD: Stochastic Controlled Averaging for Federated Learning | FLOW seminar
Sep 19 | Is Local SGD Better than Minibatch SGD?
    ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!
    Survey on Federated Optimization
    On Fifth Generation Local Training Methods
Sep 21 | On the Unreasonable Effectiveness of Federated Averaging with Heterogeneous Data | FLOW seminar
    On the Still Unreasonable Effectiveness of Federated Averaging
Sep 26 | Federated Multi-Task Learning
    An Efficient Framework for Clustered Federated Learning
Sep 28 | Personalized Federated Learning with Theoretical Guarantees: A Model-Agnostic Meta-Learning Approach
Oct 3 | Where to Begin? On the Impact of Pre-Training and Initialization in Federated Learning
Oct 5 | Client-Customized Adaptation for Parameter-Efficient Federated Learning
Oct 10 | Privacy Lecture | Federated Learning and Privacy
Oct 12 | Learning Differentially Private Recurrent Language Models | Algorithmic Foundations of DP
    The Definition of DP
    SaTML Tutorial on DP
Oct 17 | No Class (Fall Break)
Oct 19 | No Class (Fall Break)
Oct 24 | Practical Secure Aggregation for Privacy-Preserving Machine Learning | CCS video
Oct 26 | Project Pitches
Oct 31 | Inverting Gradients - How easy is it to break privacy in federated learning? | When the Curious Abandon Honesty: Federated Learning Is Not Private
Nov 2 | On Privacy and Personalization in Cross-Silo Federated Learning | blog post
Nov 7 | No Class (Election Day)
Nov 9 | How To Backdoor Federated Learning
    Can You Really Backdoor Federated Learning?
    Attack of the Tails (video)
Nov 14 | Fair Resource Allocation in Federated Learning | Agnostic Federated Learning
Nov 16 | Guest Lecture: Sai Praneeth Karimireddy
    Federated Learning as Mechanism Design
Nov 21 | Project Office Hours
Nov 23 | No Class (Thanksgiving)
Nov 28 | Model-sharing Games: Analyzing Federated Learning Under Voluntary Participation
Nov 30 | A Principled Approach to Data Valuation for Federated Learning
Dec 5 | Guest Lecture: Zachary Charles
Dec 7 | Project Presentations




General Policies

Late Policy

Students are expected to attend class in person; we will not have remote participation capabilities. If you miss a class without completing the corresponding assignment, you will get a zero for that session. If you miss a class where you are in a "presenting" role, you must still create the presentation for that role before the class and find someone else to present it for you. If you miss a class where you'd be in a "non-presenting" role, to get credit for that session you need to complete the non-presenting assignment and send it to me before the start of class. There's really no way to accept late work for the readings since it's vital that we're all reading the same papers at the same time. I also can't accept the final project after the scheduled final exam slot since you need to present it then.

Audit Policy

Official auditing of the course (i.e., taking the course for an “Audit” grade) is not permitted this semester. Unofficial auditing of the course (e.g., attending classes and participating in discussion) is welcome with instructor approval.

Pass/Fail Policy

Pass/Fail is allowed in this class; no permission is required from the course staff. The grade for the Pass cutoff will depend on your program. Be sure to check with your program / department as to whether you can count a Pass/Fail course towards your degree requirements.

Accommodations for Students with Disabilities

If you have a disability and have an accommodations letter from the Disability Resources office, please discuss your accommodations and needs with Professor Smith as early in the semester as possible. I will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, please contact them at: access@andrew.cmu.edu.


Academic Integrity Policies

Read this Carefully

Collaboration among Students

Duty to Protect One’s Work

Students are responsible for proactively protecting their work from copying and misuse by other students. If a student’s work is copied by another student, the original author is also considered to be at fault and in violation of the course policies. It does not matter whether the author allowed the work to be copied or was merely negligent in preventing it from being copied. When overlapping work is submitted by different students, both students will be punished.

To protect future students, do not post your solutions publicly, either during the course or afterwards.

Penalties for Violations of Course Policies

All violations of course policies (even the first one) will be reported to the university authorities (your Department Head, Associate Dean, Dean of Student Affairs, etc.) as an official Academic Integrity Violation and will carry severe penalties.
  1. The penalty for the first violation is a one-and-a-half letter grade reduction. For example, if your final letter grade for the course was to be an A-, it would become a C+.
  2. The penalty for the second violation is failure in the course, and can even lead to dismissal from the university.


Acknowledgments

The format of this course is based on the role-playing seminar format outlined by Alec Jacobson and Colin Raffel.