Fall 2021
The course will focus on multi-step imperfect-information games, because most real-world strategic settings are such games. These games raise issues beyond those in perfect-information games like chess and Go, such as signaling, deception, and understanding deception by others. There has been tremendous progress in the AI community on solving such games since around 2003. This course covers the fundamentals and the state of the art of solving such games.
Instructors | Office | Email
Tuomas Sandholm | GHC 9205 | sandholm AT cs
Gabriele Farina | GHC 9221 | gfarina AT cs
· Lecture time: Tuesdays & Thursdays 11:50 AM - 1:10 PM. No class on 10/14 or 11/25 due to University Holidays.
· Lecture location: GHC 4211
· Office hours: Flexible (email us to schedule)
The course will be lecture-based. At the end of the course, there will be a few lectures of project presentations by students.
Readings will consist of a mixture of papers and course notes.
Students will complete a final project. It may be done individually or in groups of 2-3 students, and it can be theoretical, experimental, or a combination of the two. Students can pick their own project topic, subject to instructor approval of the project proposal. The project can be related to the student's normal research, but it cannot be the student's normal research itself. We encourage creativity in the course projects.
Grading will be as follows:
· 50%: final project
· 40%: homework sets (there will be 2-3 homework sets that may include both paper-and-pen questions and programming assignments)
· 10%: completion of readings, attendance, and participation in class discussions
Please turn in your project proposal via email by Thursday, Oct 7th.
Please turn in your project writeup via email by Sunday, Dec 5th.
Homework | Released | Due
Homework 1 | Sept. 24 | Oct. 7 (beginning of class)
Homework 2 | Oct. 9 | Oct. 19 (beginning of class)
The lecture schedule is as follows (lecture number, date, and topics covered):
Lecture 1 (9/7): Introduction. Course organization. Introduction to game theory. Game representations: normal form, extensive form. Solution concepts. Properties of 2-player 0-sum games.
Lecture 2 (9/9): Representation of strategies in tree-form decision spaces. The obvious (but often inappropriate for optimization) behavioral representation of a strategy. Sequence-form representation. Examples of sequence-form strategies, and computation of expected utilities given the sequence-form representation (multilinearity of expected utilities). Kuhn's theorem: relationship between normal-form and sequence-form strategies. Bottom-up decomposition of the sequence-form polytope.
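For concreteness, here is the standard sequence-form constraint system in our own notation (a sketch for orientation, not taken from the course notes): a vector x indexed by a player's sequences lies in the sequence-form polytope if and only if

$$x(\varnothing) = 1, \qquad \sum_{a \in A_j} x(\sigma(j)\,a) \;=\; x(\sigma(j)) \quad \text{for every information set } j, \qquad x \ge 0,$$

where \sigma(j) is the unique sequence of the player's own actions on the path to information set j and A_j is the set of actions available at j. Expected utility is then bilinear in the two players' sequence-form strategies, u_1(x, y) = x^\top A y for a suitable sparse payoff matrix A.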
Lecture 3 (9/14): Regret minimization and hindsight rationality. Phi-regret minimization. Special cases: external, internal, and swap regret minimization. External-regret dynamics lead to Nash equilibrium in 2-player 0-sum games and to NFCCE in general multiplayer games; internal-regret dynamics lead to Nash equilibrium in 2-player 0-sum games and to NFCE in general multiplayer games. For other choices of Phi transformations, we arrive at EFCE and EFCCE. Special role of external regret minimization. Solution of convex-concave saddle-point problems via regret minimization; applications to bilinear saddle-point problems such as Nash equilibrium, optimal correlation, etc.
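A worked statement of the connection used throughout the course, in our own phrasing: if both players of a two-player zero-sum game with bilinear payoff x^\top A y choose their strategies with external regret minimizers accumulating regrets R_1^T and R_2^T, then the average strategies \bar{x} = \frac{1}{T}\sum_t x_t and \bar{y} = \frac{1}{T}\sum_t y_t satisfy

$$\max_{\hat{x} \in \mathcal{X}} \hat{x}^\top A \bar{y} \;-\; \min_{\hat{y} \in \mathcal{Y}} \bar{x}^\top A \hat{y} \;\le\; \frac{R_1^T + R_2^T}{T},$$

so sublinear regret for both players implies that the average strategy profile converges to a Nash equilibrium (saddle point).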
Lecture 4 (9/16): Blackwell approachability and external regret minimization for simplex domains. Blackwell-game approach and construction of regret matching (RM) and RM+.
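As a minimal illustration of regret matching, here is a self-contained self-play sketch on rock-paper-scissors; the code and all names in it are our own example, not course material.

```python
import numpy as np

def regret_matching(cum_regret):
    """Regret matching: play proportionally to the positive cumulative regrets."""
    pos = np.maximum(cum_regret, 0.0)
    total = pos.sum()
    return pos / total if total > 0 else np.full(len(cum_regret), 1.0 / len(cum_regret))

# Rock-paper-scissors payoff matrix for the row player (zero-sum game).
A = np.array([[ 0., -1.,  1.],
              [ 1.,  0., -1.],
              [-1.,  1.,  0.]])

T = 100_000
r1, r2 = np.zeros(3), np.zeros(3)      # cumulative regrets
avg1, avg2 = np.zeros(3), np.zeros(3)  # cumulative strategies
for _ in range(T):
    x, y = regret_matching(r1), regret_matching(r2)
    avg1 += x
    avg2 += y
    u1 = A @ y         # row player's expected payoff for each pure action
    u2 = -(A.T @ x)    # column player's expected payoff for each pure action
    r1 += u1 - x @ u1  # instantaneous regret of each action vs. the mixed play
    r2 += u2 - y @ u2

# The average strategies approach the uniform Nash equilibrium of the game.
print(avg1 / T, avg2 / T)
```

Note that it is the average strategies, not the last iterates, that converge to equilibrium, exactly as in the folk-theorem statement above.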
Lecture 5 (9/21): Regret circuits and counterfactual regret minimization (CFR). Treeplex case: regret circuits for Cartesian products and for convex hulls. Construction of CFR and pseudocode; proof of correctness and convergence speed.
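The key quantitative fact from this lecture, stated in our notation: CFR bounds a player's overall external regret by the sum of per-information-set counterfactual regrets,

$$R^T \;\le\; \sum_{I \in \mathcal{I}} \bigl[R^T_{\mathrm{cf}}(I)\bigr]^{+} \;\le\; \Delta\,|\mathcal{I}|\,\sqrt{|A|\,T},$$

where \Delta is the payoff range, \mathcal{I} the player's information sets, and |A| an upper bound on the number of actions per information set; dividing by T gives the O(1/\sqrt{T}) rate of convergence to Nash equilibrium in two-player zero-sum games.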
Lecture 6 (9/23): Provably correct techniques for speeding up CFR algorithms. Alternation. Reweighted updates of regrets and strategies: LCFR, DCFR. Dynamic pruning in imperfect-information games. Warm starting from given strategies.
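As a reference point for the reweighting schemes named above (our summary of Brown and Sandholm's discounted CFR; the parameter values are the ones recommended in that work, quoted from memory): at iteration t, DCFR multiplies positive cumulative regrets by t^{\alpha}/(t^{\alpha}+1), negative cumulative regrets by t^{\beta}/(t^{\beta}+1), and weights contributions to the average strategy by (t/(t+1))^{\gamma}, with \alpha = 1.5, \beta = 0, \gamma = 2. LCFR corresponds to weighting both the regrets and the average-strategy contributions linearly in t.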
Lecture 7 (9/28): Optimistic/predictive regret minimization via online optimization. Online projected gradient descent. Distance-generating functions. Predictive follow-the-regularized-leader (FTRL), predictive online mirror descent (OMD), and RVU bounds. Notable instantiations, e.g., optimistic hedge/multiplicative weights update. Accelerated convergence to bilinear saddle points. Dilatable global entropy.
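For orientation, the predictive (optimistic) online mirror descent update looks like this in our notation (losses g_t, predictions m_t, distance-generating function \varphi with Bregman divergence D_\varphi, stepsize \eta):

$$x_t = \arg\min_{x \in \mathcal{X}} \;\eta\,\langle m_t, x\rangle + D_\varphi(x \,\|\, z_{t-1}), \qquad z_t = \arg\min_{z \in \mathcal{X}} \;\eta\,\langle g_t, z\rangle + D_\varphi(z \,\|\, z_{t-1}),$$

and an algorithm satisfies an RVU (regret bounded by variation in utilities) bound if, for constants \alpha, \beta, \gamma > 0,

$$R^T \;\le\; \alpha \;+\; \beta \sum_{t=2}^{T} \|g_t - g_{t-1}\|_*^2 \;-\; \gamma \sum_{t=2}^{T} \|x_t - x_{t-1}\|^2.$$

Predictive FTRL/OMD with the prediction m_t = g_{t-1} satisfies such a bound, which is the property behind the accelerated convergence to bilinear saddle points.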
Lecture 8 (9/30): Predictive Blackwell approachability. Blackwell approachability on conic domains. Using regret minimization to solve a Blackwell approachability game. Abernethy et al.’s construction. Predictive Blackwell approachability.
Lecture 9 (10/5): Predictive regret matching and predictive regret matching plus. Connections between follow-the-regularized-leader / online mirror descent and regret matching / regret matching plus. Construction of predictive regret matching and predictive regret matching plus.
Lecture 10 (10/7): Monte-Carlo CFR and offline optimization methods for two-player zero-sum games. Regret minimization with unbiased estimators of the utilities. Game-theoretic utility estimators (external sampling, outcome sampling). Offline optimization methods for two-player zero-sum games. Accelerated first-order saddle-point solvers (excessive gap technique, mirror prox). Linear programming formulation of Nash equilibrium strategies. Payoff-matrix sparsification technique.
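The linear programming formulation mentioned above, written in standard sequence-form notation (E x = e, x ≥ 0 and F y = f, y ≥ 0 are the two players' sequence-form constraint systems, and A is player 1's sequence-form payoff matrix): player 1's equilibrium strategies are the x-parts of optimal solutions to

$$\max_{x,\,v} \; f^\top v \qquad \text{s.t.} \qquad F^\top v \;\le\; A^\top x, \qquad E x = e, \qquad x \ge 0,$$

obtained by dualizing the inner minimization in \max_x \min_y x^\top A y over the opponent's sequence-form polytope.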
Lecture 11 (10/12): Game abstraction 1: practical state of the art. Lossless abstraction: GameShrink. Lossy state abstraction. Potential-aware, earth-mover-distance abstraction.
Lecture 12 (10/19): Game abstraction 2. Abstraction algorithm for distributed equilibrium finding. Action-abstraction algorithms. Reverse mapping. Abstraction pathology. Lossy abstraction with solution-quality bounds. Application of the theory to game modeling.
Lecture 13 (10/21): State of the art for two-player no-limit Texas hold’em: Libratus. Subgame solving in imperfect-information games. Self-improver. Time allowing: Deep CFR as an alternative to abstraction, and Supremus improvements to DeepStack.
Lecture 14 (10/26): State of the art for multi-player no-limit Texas hold’em: Pluribus. Depth-limited subgame solving.
Lecture 15 (10/28): “Endgame” solving without a blueprint strategy: ReBeL. Guest lecture by Noam Brown.
Lecture 16 (11/2): Simulator-access games 1: state of the art for StarCraft: AlphaStar. Guest lecture by Wojtek Czarnecki from DeepMind, who worked on the multiagent aspects of AlphaStar.
Lecture 17 (11/4): Simulator-access games 2: (near-)optimality certificates and how to learn them. Guest lecture by Brian Zhang.
Lecture 18 (11/9): Opponent exploitation. Different approaches to opponent exploitation. Hybrids with game theory. Safe opponent exploitation.
Lecture 19 (11/11): Equilibrium refinements. Sequential irrationality. Trembling-hand equilibrium refinements: quasi-perfect equilibrium (QPE) and extensive-form perfect equilibrium (EFPE). Relationships among refinements. Computational complexity. Trembling-hand linear programming formulations of QPE and EFPE. Scalable exact algorithms for QPE and EFPE.
Lecture 20 (11/16): Correlated strategies and team coordination. Team maxmin equilibrium and TMECor; why the latter is often significantly better. The realization polytope: low-dimensional, but only its vertices are known rather than its constraints; ways around that in practice.
Lecture 21 (11/18): Course project presentations. Presentations from David, Keegan, and Kevin.
Lecture 22 (11/23): Course project presentations. Presentations from Neil, Wan, and Alex.
Lecture 23 (11/30): Course project presentations. Presentations from Brian, Ioannis, and Carmel.
Lecture 24 (12/2): Course project presentations. Presentations from Justin, Anita, and Gokul.
Professor | Title | Year | University
Christian Kroer | | 2020 | Columbia
Tuomas Sandholm | | 2015 | CMU
John P. Dickerson | | 2018 | UMD
Fei Fang | | 2018 | CMU
Constantinos Daskalakis | | 2015 | MIT
Yiling Chen | Topics at the Interface between Computer Science and Economics | 2016 | Harvard
Vincent Conitzer | Computational Microeconomics: Game Theory, Social Choice, and Mechanism Design | 2016 | Duke
Sanmay Das | | 2016 | Wash U
Ariel Procaccia | | 2016 | CMU
Milind Tambe | | 2016 | USC
Tim Roughgarden | | 2013 | Stanford