Computational Game Solving

Fall 2024

Focus

The course will focus on multi-step imperfect-information games because most real-world strategic settings are such games. These games raise issues that do not arise in perfect-information games like chess and Go, such as signaling, deception, and understanding deception by others. There has been tremendous progress in the AI community on solving such games since around 2003. This course covers the fundamentals and the state of the art of solving such games.

Course logistics

Instructors

·       Prof. Tuomas Sandholm: office GHC 9205, email sandholm AT cs

·       Brian Hu Zhang: office GHC 9011, email bhzhang AT cs

·       Lecture time: Mondays & Wednesdays, 11:00 AM – 12:20 PM. No class on 9/2, 10/14, 10/16, or 11/27 due to university holidays.

·       Lecture location: Wean Hall 4625

·       Office hours:

o   Tuomas: Mondays after class, 12:20 PM – 1:00 PM in GHC 9205, except 9/2, 10/14, and 11/4

o   Brian: Fridays, 2:00 PM – 3:00 PM at the Gates 5th floor tables by the bridge (office: GHC 9011), except 10/18 and 11/29

Course structure and evaluation

The course will be lecture-based. At the end of the course, there will be a few lectures of project presentations by students.

Readings will consist of a mixture of papers and course notes.

Students will complete a final project. It may be done individually or in groups of 2-3 students. It can be theoretical, experimental, or a combination of the two. Students can pick their own course project topic, subject to instructor approval of the project proposal. The project can be related to the student’s normal research, but it cannot be the student’s normal research itself. We encourage creativity in the course projects.

Grading will be as follows:

·       50% final project

·       40% homework sets (there will be two homework sets, which may include both pen-and-paper questions and programming assignments)

·       10% completion of readings, attendance, and participation in class discussions

Please turn in your project proposal on Gradescope by Wednesday, Oct 2. Project proposal guidelines can be found here.

Please turn in your project writeup on Gradescope by Friday, Dec 6.

Homework sets

·       Homework 1: released Sep 12, due Sep 30. Files: hw1.pdf, hw1.zip

·       Homework 2: released Oct 7, due Oct 28. Files: hw2.pdf, hw2.zip

On both HW1 and HW2, we will allow late submissions with a penalty of 5 points per day, rounded up to the nearest whole day, up to a maximum of 5 days (-25 points), after which we will no longer accept submissions. (For example, a submission that is 30 hours late counts as 2 days late and incurs a 10-point penalty.) This late policy does not apply to the project deadlines (proposal or final report), which we will not accept late. Your project proposal should be seen as a chance to get feedback from us on your project idea; the earlier that happens, the more useful it is. The final report deadline (Dec 6) cannot be moved later due to logistical constraints (we need time to read them all!).

Schedule (subject to change)

Lecture 1 (M 8/26): Introduction
Course organization. Introduction to game theory. Game representations. Normal form, extensive form. Solution concepts. Properties of 2-player 0-sum games.
Slides

Lecture 2 (W 8/28): Perfect-information games 1
Tree search methods for two-player perfect-information games: minmax search, alpha-beta pruning, iterative deepening, quiescence search, singular extension, evaluation function learning, endgame databases, horizon problem, search depth pathology, chess.
Slides

Lecture 3 (W 9/4): Perfect-information games 2
Monte Carlo Tree Search (MCTS). AlphaGo and AlphaGo Zero.
Readings: [Silver et al., Nature’17]
Slides

Lecture 4 (M 9/9): Normal-form games
Review of solution concepts. LP formulation of zero-sum normal-form Nash equilibrium computation. Fictitious play and follow the regularized leader (FTRL). Regret. Multiplicative weights (MWU). Regret matching (RM) and RM+. Optimism. Coarse-correlated equilibria. (See the regret-matching sketch after the schedule.)
Readings: [Sections 4.1 and 4.6 of MAS], [Sections 4.2-3 of AGT]
Slides

Lecture 5 (W 9/11): Extensive-form games 1: Counterfactual regret minimization
Extensive-form games. Behavioral representation of a strategy. Sequence-form representation and sequence-form LP. Kuhn's theorem: relationship between normal-form and sequence-form strategies. Construction of CFR and proof of correctness and convergence speed.
Readings: [Section 5.2.3 of MAS], [Sections 3.10-11 of AGT], [Zinkevich et al., NIPS’07], [Farina et al., ICML’19], [F21 Lecture Notes]
Slides

Lecture 6 (M 9/16): Extensive-form games 2: CFR speedups
Alternation. Reweighted updates of regrets and strategies, LCFR, DCFR. Dynamic pruning in imperfect-information games. Warm starting from given strategies. Optimistic regret minimization algorithms.
Readings: [Brown & Sandholm, ICML’17], [Brown & Sandholm, AAAI’19], [Farina, Kroer, and Sandholm, AAAI’21]
Slides

Lecture 7 (W 9/18): Learning in general-sum games beyond coarse-correlated equilibria
The GGM framework, and Blum-Mansour as a special case.
Readings: [Blum & Mansour, JMLR’07], [Gordon, Greenwald, and Marks, ICML’08]
Slides

Lecture 8 (M 9/23): Extensive-form games 3: Φ-regret in extensive-form games
Φ-regret and its connection to types of correlated equilibrium. Linear-swap correlated equilibria as a special case. Representing linear deviations. Expected fixed points and nonlinear deviations. The TreeSwap algorithm for swap regret in general polytopes.
Readings: [Zhang, Farina, and Sandholm, ICLR’24], [Zhang et al., arXiv’24]; TreeSwap: [Dagan et al., STOC’24] or [Peng & Rubinstein, STOC’24]
Slides

Lecture 9 (W 9/25): Beyond two-player zero-sum games and faster learning dynamics
Guest lecture by Ioannis Anagnostides.
Readings: [Anagnostides et al., ICML’22]
Slides

Lecture 10 (M 9/30): Game abstraction 1: Practical state of the art
Lossless abstraction: GameShrink. Lossy state abstraction. Potential-aware, earth-mover-distance abstraction.
Readings: [Brown et al., AAMAS’15]
Slides

Lecture 11 (W 10/2): Game abstraction 2
Abstraction algorithm for distributed equilibrium finding. Action-abstraction algorithms. Reverse mapping. Abstraction pathology. Lossy abstraction with solution-quality bounds. Application of the theory to game modeling.
Readings: [Kroer & Sandholm, NeurIPS’18]
Slides

Lecture 12 (M 10/7): State of the art for two-player no-limit Texas hold’em: Libratus
Subgame solving in imperfect-information games. Self-improver.
Readings: [Brown & Sandholm, Science’18]
Slides

Lecture 13 (W 10/9): Knowledge-limited subgame solving (KLSS)
Limits of subgame solving with common knowledge. Subgame solving without common knowledge. Dark chess. Safe KLSS.
Readings: [Zhang and Sandholm, NeurIPS’21], [Liu et al., ICML’23]
Slides

Lecture 14 (M 10/21): State of the art for multi-player no-limit Texas hold’em: Pluribus
Depth-limited subgame solving.
Readings: [Brown & Sandholm, Science’19]
Slides

Lecture 15 (W 10/23): Deep learning in game solving 1
Monte Carlo CFR (MCCFR) and sampling approaches. Deep CFR as an alternative to abstraction. DREAM, ESCHER.
Readings: [Lanctot et al., NeurIPS’09], [Brown et al., ICML’19], [McAleer et al., ICLR’23]
Slides

Lecture 16 (M 10/28): Deep learning in game solving 2
Double oracle (DO), policy space response oracles (PSRO), extensive-form double oracle (XDO), anytime PSRO, self-play PSRO, diversity in PSRO. AlphaStar and OpenAI Five.
Readings: [Heinrich and Silver, arXiv’16], [Vinyals et al., Nature’19], [OpenAI et al., 2019], [Lanctot et al., NeurIPS’17], [McAleer et al., NeurIPS’21], [McAleer et al., ICLR’24]
Slides

Lecture 17 (W 10/30): Deep learning in game solving 3
DeepNash for expert-level Stratego: state of the art for Stratego. Magnetic Mirror Descent. Guest lecture by Stephen McAleer (OpenAI).
Readings: [Perolat et al., Science’22], [Perolat et al., ICML’21], [Sokota et al., ICLR’23]
Slides

Lecture 18 (M 11/4): (Near-)optimality certificates and how to learn them even in simulator-access (black-box) games
Efficient theoretical algorithms for computing and learning certifiable equilibria in black-box games.
Readings: [Zhang & Sandholm, AAAI’21], [Kozuno et al., NeurIPS’21]
Slides

Lecture 19 (W 11/6): Team games
Team maxmin equilibrium and TMECor; why the latter is often significantly better. Realization polytope: low-dimensional but hard to represent; ways around that in practice. Team DAG and Team PSRO.
Readings: [Farina et al., NeurIPS’18], [Zhang, Farina, and Sandholm, ICML’23], [McAleer et al., NeurIPS’23], [Zhang et al., EC’24]
Slides

Lecture 20 (M 11/11): Automated (general!) mechanism design (1/2)
Stackelberg equilibria, correlated equilibria, mechanism design, and information design. Optimal (revenue-maximizing) auctions.
Readings: [Myerson, Math of OR’81], [Kamenica and Gentzkow, AER’11]
Slides

Lecture 21 (W 11/13): Automated (general!) mechanism design (2/2)
Revelation principle. Optimal (e.g., revenue-maximizing) mechanisms. Information design and its relation to correlated equilibria. General mechanisms and communication equilibria, and efficient algorithms for these problems.
Readings: [Conitzer and Sandholm, EC’04], [Dütting et al., ICML’19], [Zhang et al., NeurIPS’23]
Slides

Lecture 22 (M 11/18): Q&A (first 50 mins); course project presentation (1/5) (last 25 mins)
Course project: Russell

Lecture 23 (W 11/20): Course project presentations (2/5)
Course projects: Yixuan, Jingxun, Verity

Lecture 24 (M 11/25): Course project presentations (3/5)
Course projects: Alex, Yuxin, Nairit

Lecture 25 (M 12/2): Course project presentations (4/5)
Course projects: Gabe, Sirui, Hong

Lecture 26 (W 12/4): Course project presentations (5/5)
Course projects: Eric/Shiva/Shawn, Henry/Korinna/George
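As a small companion to the regret-matching material in Lecture 4 (referenced above), here is a minimal, illustrative Python sketch of regret-matching (RM) self-play in a two-player zero-sum normal-form game, using rock-paper-scissors as a stand-in example. It is not course-provided code; the payoff matrix, function names, and iteration count are assumptions made for this illustration.

# Minimal sketch of regret matching (RM) self-play in a two-player
# zero-sum normal-form game, illustrated on rock-paper-scissors.
import numpy as np

# Row player's payoff matrix A (zero-sum: column player's payoff is -A).
# Rows/columns are ordered Rock, Paper, Scissors.
A = np.array([[ 0, -1,  1],
              [ 1,  0, -1],
              [-1,  1,  0]], dtype=float)

def rm_strategy(regret_sum):
    """Regret matching: play actions in proportion to positive cumulative regret."""
    positive = np.maximum(regret_sum, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    # If no action has positive regret, play uniformly at random.
    return np.full(len(regret_sum), 1.0 / len(regret_sum))

def run_self_play(T=100_000):
    n, m = A.shape
    regret_x, regret_y = np.zeros(n), np.zeros(m)          # cumulative regrets
    strat_sum_x, strat_sum_y = np.zeros(n), np.zeros(m)    # for average strategies

    for _ in range(T):
        x, y = rm_strategy(regret_x), rm_strategy(regret_y)
        strat_sum_x += x
        strat_sum_y += y
        # Expected utility of each pure action against the opponent's current mix.
        u_x = A @ y           # row player's utility per action
        u_y = -(A.T @ x)      # column player's utility per action (zero-sum)
        regret_x += u_x - x @ u_x
        regret_y += u_y - y @ u_y

    # The average strategies approach the equilibrium (uniform 1/3 each for RPS).
    return strat_sum_x / T, strat_sum_y / T

if __name__ == "__main__":
    avg_x, avg_y = run_self_play()
    print("Average strategies:", np.round(avg_x, 3), np.round(avg_y, 3))

In two-player zero-sum games it is the average strategies, not the last iterates, that converge to a Nash equilibrium; CFR (Lecture 5) applies the same regret-matching update locally at each information set of an extensive-form game.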

 

 

Related courses

·       Tuomas Sandholm and Stephen McAleer: Computational Game Solving (CMU, 2023)

·       Tuomas Sandholm and Gabriele Farina: Computational Game Solving (CMU, 2021)

·       Christian Kroer: Economics, AI, and Optimization (Columbia, 2020)

·       Tuomas Sandholm: Foundations of Electronic Marketplaces (CMU, 2015)

·       John P. Dickerson: Applied Mechanism Design for Social Good (UMD, 2018)

·       Fei Fang: Artificial Intelligence Methods for Social Good (CMU, 2018)

·       Constantinos Daskalakis: Games, Decision, and Computation (MIT, 2015)

·       Yiling Chen: Topics at the Interface between Computer Science and Economics (Harvard, 2016)

·       Vincent Conitzer: Computational Microeconomics: Game Theory, Social Choice, and Mechanism Design (Duke, 2016)

·       Sanmay Das: Multi-Agent Systems (Wash U, 2016)

·       Ariel Procaccia: Truth, Justice, and Algorithms (CMU, 2016)

·       Milind Tambe: Security and Game Theory (USC, 2016)

·       Tim Roughgarden: Algorithmic Game Theory (Stanford, 2013)