Fall 2023
The course will focus on multi-step imperfect-information games because
most real-world strategic settings are such games. Such games beget additional
issues beyond perfect-information games like chess and Go, such as signaling,
deception, and understanding deception by others. There has been tremendous
progress in the AI community on solving such games since around 2003. This
course covers the fundamentals and the state of the art of solving such games.
Instructors |
Office |
Email |
GHC 9205 |
sandholm AT cs |
|
GHC 9114 |
smcaleer AT cs |
· Lecture time: Mondays & Wednesdays 11:00 AM - 12:20 PM. No class on 9/4, 10/16, 10/18, or 11/22 due to University Holidays.
· Lecture location: Wean Hall 4625
· Office hours: Tuomas: after class Monday GHC 9205; Stephen: after class Wednesday GHC 9114.
The course will be lecture based. At the end of the course
there will be a few lectures of project presentations by students.
Readings will consist of a mixture of papers and course notes.
Students will complete a final project. It may be done individually or in groups of 2-3 students. It can be theoretical or experimental, or a combination. The students can pick their course project topic subject to instructor approval of the project proposal. The project can be related to the student’s normal research, but it cannot be the student’s normal research itself. We encourage creativity in the course projects.
Grading will be as follows:
·
50%
final project
·
40%
homework sets (there will be 2 homework sets that may include both
paper-and-pen questions and programming assignments)
·
10%
completion of readings, attendance, and participation in class discussions
Please turn in your project proposal via email by Wednesday, Oct 4th.
Please turn in your project writeup via email by Friday, Dec 8th.
Released |
Due |
Files |
|
Homework 1 |
Sept. 18 |
Oct. 2 (beginning of class) or Oct. 6 for a 20% penalty |
|
Homework 2 |
Oct. 6 |
Oct. 23 (beginning of class) |
Lecture |
Date |
Topic |
Reading(s) |
Lecture notes |
1 |
8/28 |
Introduction Course organization. Introduction to game theory. Game representations. Normal form, extensive form. Solution concepts. Properties of 2-player 0-sum games. |
||
2 |
8/30 |
Perfect-information games Tree search methods for two-player perfect-information games. AlphaGo and AlphaZero. |
||
3 |
9/6 |
Imperfect-information games 1 Review of solution concepts. LP formulation. Fictitious play and follow the regularized leader (FTRL). Regret. Multiplicative weights (MWU). Regret matching (RM) and RM+. Optimism. |
||
4 |
9/11 |
Imperfect-information games 2 Markov decision processes (MDPs) and introduction to reinforcement learning. Markov games and partially-observable Markov games. Extensive-form games. Behavioral representation of a strategy. Sequence-form representation. Kuhn's theorem: relationship between normal-form and sequence-form strategies. Bottom-up decomposition of the sequence-form polytope. |
||
5 |
9/13 |
Counterfactual regret minimization Treeplex case: regret circuits for Cartesian products and for convex hull. Construction of CFR and pseudocode; proof of correctness and convergence speed. Alternation. Hard instances. Linear averaging. |
[Zinkevich et al., NIPS’07] |
|
6 |
9/18 |
Provably correct techniques for speeding up CFR algorithms Alternation. Reweighted updates of regrets and strategies, LCFR, DCFR. Dynamic pruning in imperfect-information games. Warm starting from given strategies. Optimistic regret minimization algorithms. |
[Brown & Sandholm, ICML’17] |
|
7 |
9/20 |
Game abstraction 1: Practical state of the art Lossless abstraction: GameShrink. Lossy state abstraction. Potential-aware, earth-mover-distance abstraction. |
||
8 |
9/25 |
Game abstraction 2 Abstraction algorithm for distributed equilibrium finding. Action-abstraction algorithms. Reverse mapping. Abstraction pathology. Lossy abstraction with solution-quality bounds. Application of the theory to game modeling. |
||
9 |
9/27 |
Multiagent learning dynamics Guest Lecture by Ioannis Anagnostides. |
||
10 |
10/2 |
State-of-the-art for two-player no-limit Texas hold’em: Libratus Subgame solving in imperfect-information games. Self-improver. |
||
11 |
10/4 |
State-of-the-art for multi-player no-limit Texas hold’em: Pluribus Depth-limited subgame solving. |
||
12 |
10/9 |
Knowledge-limited subgame solving (KLSS), safe KLSS, and state of the art in dark chess. Guest lecture by Brian Zhang. |
||
13 |
10/11 |
Deep learning in tree-based game solving 1 Deep CFR as an alternative to abstraction. Neural Fictitious Play [Heinrich et al. 2016], multiagent policy gradient methods. |
||
14 |
10/23 |
Deep learning in tree-based game solving 2 Loss estimation, Monte Carlo CFR (MCCFR), and sampling approaches. DREAM, ESCHER, Neural Replicator Dynamics. |
||
15 |
10/25 |
Deep learning in tree-based game solving 3 DeepNash for expert-level Stratego: State of the art for Stratego. Magnetic Mirror Descent. |
[Perolat et al., Science’22] |
|
16 |
10/30 |
Deep learning in tree-based game solving 4: State-of-the-art for Dota and Starcraft. AlphaStar and OpenAI 5. |
||
17 |
11/1 |
State-of-the-art in double oracle algorithms Double oracle (DO), policy space response oracles (PSRO), extensive-form double oracle (XDO), anytime PSRO, self-play PSRO, diversity in PSRO. |
[Lanctot et al., NeurIPS’17] |
|
18 |
11/6 |
(Near)optimality certificates
and how to learn them even in simulator-access (blackbox) games. Guest lecture by Brian Zhang. |
||
19 |
11/8 |
State-of-the-art in Diplomacy. Guest lecture by Noam Brown (OpenAI). |
[Meta Fundamental AI Research Diplomacy Team (FAIR)†, et al., Science’22] |
|
20 |
11/13 |
Team games Team maxmin equilibrium and TMECor; why the latter is often significantly better. Realization polytope: low dimensional but only the vertices are known and not the constraints; ways around that in practice. Team PSRO. |
||
21 |
11/15 |
Opponent exploitation Different approaches to opponent exploitation. Hybrids with game theory. Safe opponent exploitation. |
[Ganzfried & Sandholm, AAMAS’11] |
|
22 |
11/20 |
Course project presentations No class: optional project help in Tuomas' office. |
||
23 |
11/27 |
Course project presentations Carlos; Tobias and Patrick |
||
24 |
11/29 |
Course project presentations Naifeng; Arnhav and Abishek |
||
25 |
12/4 |
Course project presentations Harrison, Gil, and Cynthia; Emin |
||
26 |
12/6 |
Course project presentations May; Mingkuan; Richard and John |
Professor |
Title |
Year |
University |
Tuomas Sandholm and Gabriele Farina |
2021 |
CMU |
|
Christian Kroer |
2020 |
Columbia |
|
Tuomas Sandholm |
2015 |
CMU |
|
John P. Dickerson |
2018 |
UMD |
|
Fei Fang |
2018 |
CMU |
|
Constantinos Daskalakis |
2015 |
MIT |
|
Yiling Chen |
Topics at the Interface between Computer Science and
Economics |
2016 |
Harvard |
Vincent Conitzer |
Computational Microeconomics: Game Theory, Social Choice, and
Mechanism Design |
2016 |
Duke |
Sanmay Das |
2016 |
Wash U |
|
Ariel Procaccia |
2016 |
CMU |
|
Milind Tambe |
2016 |
USC |
|
Tim Roughgarden |
2013 |
Stanford |