Computational Game Solving

Fall 2023

Focus

The course will focus on multi-step imperfect-information games because most real-world strategic settings are such games. Such games beget additional issues beyond perfect-information games like chess and Go, such as signaling, deception, and understanding deception by others. There has been tremendous progress in the AI community on solving such games since around 2003. This course covers the fundamentals and the state of the art of solving such games.

Course logistics

Instructors	Office	Email
Prof. Tuomas Sandholm	GHC 9205	sandholm AT cs
Stephen McAleer	GHC 9114	smcaleer AT cs

· Lecture time: Mondays & Wednesdays 11:00 AM - 12:20 PM. No class on 9/4, 10/16, 10/18, or 11/22 due to University Holidays.

· Lecture location: Wean Hall 4625

· Office hours: Tuomas: after class Monday GHC 9205; Stephen: after class Wednesday GHC 9114.

Course structure and evaluation

The course will be lecture based. At the end of the course there will be a few lectures of project presentations by students.

Readings will consist of a mixture of papers and course notes.

Students will complete a final project. It may be done individually or in groups of 2-3 students. It can be theoretical or experimental, or a combination. The students can pick their course project topic subject to instructor approval of the project proposal. The project can be related to the student’s normal research, but it cannot be the student’s normal research itself. We encourage creativity in the course projects.

Grading will be as follows:

· 50% final project

· 40% homework sets (there will be 2 homework sets that may include both paper-and-pen questions and programming assignments)

· 10% completion of readings, attendance, and participation in class discussions

Please turn in your project proposal via email by Wednesday, Oct 4^th.

Please turn in your project writeup via email by Friday, Dec 8^th.

Homework sets

	Released	Due	Files
Homework 1	Sept. 18	Oct. 2 (beginning of class) or Oct. 6 for a 20% penalty	hw1.pdf hw1.zip
Homework 2	Oct. 6	Oct. 23 (beginning of class)	hw2.pdf hw2.zip

Schedule (subject to change)

Lecture	Date	Topic	Reading(s)	Lecture notes
1	8/28	Introduction Course organization. Introduction to game theory. Game representations. Normal form, extensive form. Solution concepts. Properties of 2-player 0-sum games.		Slides
2	8/30	Perfect-information games Tree search methods for two-player perfect-information games. AlphaGo and AlphaZero.	[Silver et al., Nature’17]	Slides
3	9/6	Imperfect-information games 1 Review of solution concepts. LP formulation. Fictitious play and follow the regularized leader (FTRL). Regret. Multiplicative weights (MWU). Regret matching (RM) and RM+. Optimism.	[Section 4.1 and 4.6 of MAS] [Section 4.2-3 of AGT]	Slides
4	9/11	Imperfect-information games 2 Markov decision processes (MDPs) and introduction to reinforcement learning. Markov games and partially-observable Markov games. Extensive-form games. Behavioral representation of a strategy. Sequence-form representation. Kuhn's theorem: relationship between normal-form and sequence-form strategies. Bottom-up decomposition of the sequence-form polytope.	[Section 5.2.3 of MAS] [Section 3.10-11 of AGT]	Slides
5	9/13	Counterfactual regret minimization Treeplex case: regret circuits for Cartesian products and for convex hull. Construction of CFR and pseudocode; proof of correctness and convergence speed. Alternation. Hard instances. Linear averaging.	[Zinkevich et al., NIPS’07] [Farina et al., ICML’19] [F21 Lecture Notes]	Slides
6	9/18	Provably correct techniques for speeding up CFR algorithms Alternation. Reweighted updates of regrets and strategies, LCFR, DCFR. Dynamic pruning in imperfect-information games. Warm starting from given strategies. Optimistic regret minimization algorithms.	[Brown & Sandholm, ICML’17] [Brown & Sandholm, AAAI’19] [Farina, Kroer, and Sandholm AAAI’21]	Slides
7	9/20	Game abstraction 1: Practical state of the art Lossless abstraction: GameShrink. Lossy state abstraction. Potential-aware, earth-mover-distance abstraction.	[Brown et al., AAMAS’15]	Slides
8	9/25	Game abstraction 2 Abstraction algorithm for distributed equilibrium finding. Action-abstraction algorithms. Reverse mapping. Abstraction pathology. Lossy abstraction with solution-quality bounds. Application of the theory to game modeling.	[Kroer & Sandholm, NeurIPS’18]	Slides
9	9/27	Multiagent learning dynamics Guest Lecture by Ioannis Anagnostides.	[Anagnostides et al., ICML’22]	Slides
10	10/2	State-of-the-art for two-player no-limit Texas hold’em: Libratus Subgame solving in imperfect-information games. Self-improver.	[Brown & Sandholm, Science’18]	Slides
11	10/4	State-of-the-art for multi-player no-limit Texas hold’em: Pluribus Depth-limited subgame solving.	[Brown & Sandholm, Science’19]	Slides
12	10/9	Knowledge-limited subgame solving (KLSS), safe KLSS, and state of the art in dark chess. Guest lecture by Brian Zhang.	[Zhang and Sandholm, NeurIPS’21] [Liu et al., ICML’23]	Slides
13	10/11	Deep learning in tree-based game solving 1 Deep CFR as an alternative to abstraction. Neural Fictitious Play [Heinrich et al. 2016], multiagent policy gradient methods.	[Heinrich and Silver, Arxiv’16] [Brown et al., ICML’19]	Slides
14	10/23	Deep learning in tree-based game solving 2 Loss estimation, Monte Carlo CFR (MCCFR), and sampling approaches. DREAM, ESCHER, Neural Replicator Dynamics.	[Lanctot et al., NeurIPS’09] [McAleer et al., ICLR’23]	Slides
15	10/25	Deep learning in tree-based game solving 3 DeepNash for expert-level Stratego: State of the art for Stratego. Magnetic Mirror Descent.	[Perolat et al., Science’22] [Perolat et al., ICML’21] [Sokota et al., ICLR’23]	Slides
16	10/30	Deep learning in tree-based game solving 4: State-of-the-art for Dota and Starcraft. AlphaStar and OpenAI 5.	[Vinyals et al., Nature’19] [OpenAI et al., 2019]	Slides
17	11/1	State-of-the-art in double oracle algorithms Double oracle (DO), policy space response oracles (PSRO), extensive-form double oracle (XDO), anytime PSRO, self-play PSRO, diversity in PSRO.	[Lanctot et al., NeurIPS’17] [McAleer et al., NeurIPS’21] [McAleer et al., 2022]	Slides
18	11/6	(Near)optimality certificates and how to learn them even in simulator-access (blackbox) games. Guest lecture by Brian Zhang.	[Zhang & Sandholm, AAAI’21]	Slides
19	11/8	State-of-the-art in Diplomacy. Guest lecture by Noam Brown (OpenAI).	[Meta Fundamental AI Research Diplomacy Team (FAIR)†, et al., Science’22]
20	11/13	Team games Team maxmin equilibrium and TMECor; why the latter is often significantly better. Realization polytope: low dimensional but only the vertices are known and not the constraints; ways around that in practice. Team PSRO.	[Farina et al., NeurIPS’18] [Zhang & Sandholm, arXiv’21]	Slides
21	11/15	Opponent exploitation Different approaches to opponent exploitation. Hybrids with game theory. Safe opponent exploitation.	[Ganzfried & Sandholm, AAMAS’11] [Ganzfried & Sandholm, TEAC’15]	Slides
22	11/20	Course project presentations No class: optional project help in Tuomas' office.
23	11/27	Course project presentations Carlos; Tobias and Patrick
24	11/29	Course project presentations Naifeng; Arnhav and Abishek
25	12/4	Course project presentations Harrison, Gil, and Cynthia; Emin
26	12/6	Course project presentations May; Mingkuan; Richard and John

Related courses

Professor	Title	Year	University
Tuomas Sandholm and Gabriele Farina	Computational Game Solving	2021	CMU
Christian Kroer	Economics, AI, and Optimization	2020	Columbia
Tuomas Sandholm	Foundations of Electronic Marketplaces	2015	CMU
John P. Dickerson	Applied Mechanism Design for Social Good	2018	UMD
Fei Fang	Artificial Intelligence Methods for Social Good	2018	CMU
Constantinos Daskalakis	Games, Decision, and Computation	2015	MIT
Yiling Chen	Topics at the Interface between Computer Science and Economics	2016	Harvard
Vincent Conitzer	Computational Microeconomics: Game Theory, Social Choice, and Mechanism Design	2016	Duke
Sanmay Das	Multi-Agent Systems	2016	Wash U
Ariel Procaccia	Truth, Justice, and Algorithms	2016	CMU
Milind Tambe	Security and Game Theory	2016	USC
Tim Roughgarden	Algorithmic Game Theory	2013	Stanford