Computational Game Solving

Fall 2023

Focus

The course will focus on multi-step imperfect-information games because most real-world strategic settings are such games. Such games beget additional issues beyond perfect-information games like chess and Go, such as signaling, deception, and understanding deception by others. There has been tremendous progress in the AI community on solving such games since around 2003. This course covers the fundamentals and the state of the art of solving such games.

Course logistics

Instructors

Office

Email

Prof. Tuomas Sandholm

GHC 9205

sandholm AT cs

Stephen McAleer

GHC 9114

smcaleer AT cs

 

·       Lecture time: Mondays & Wednesdays 11:00 AM - 12:20 PM. No class on 9/4, 10/16, 10/18, or 11/22 due to University Holidays.

·       Lecture location: Wean Hall 4625

·       Office hours: Tuomas: after class Monday GHC 9205; Stephen: after class Wednesday GHC 9114.

Course structure and evaluation

The course will be lecture based. At the end of the course there will be a few lectures of project presentations by students.

Readings will consist of a mixture of papers and course notes.

Students will complete a final project. It may be done individually or in groups of 2-3 students. It can be theoretical or experimental, or a combination. The students can pick their course project topic subject to instructor approval of the project proposal. The project can be related to the student’s normal research, but it cannot be the student’s normal research itself. We encourage creativity in the course projects.

Grading will be as follows:

·       50% final project

·       40% homework sets (there will be 2 homework sets that may include both paper-and-pen questions and programming assignments)

·       10% completion of readings, attendance, and participation in class discussions

Please turn in your project proposal via email by Wednesday, Oct 4th.

Please turn in your project writeup via email by Friday, Dec 8th.

Homework sets

Released

Due

Files

Homework 1

Sept. 18

Oct. 2 (beginning of class) or Oct. 6 for a 20% penalty

hw1.pdf    hw1.zip   

Homework 2

Oct. 6

Oct. 23 (beginning of class)

hw2.pdf    hw2.zip

 

Schedule (subject to change)

Lecture

Date

Topic

Reading(s)

Lecture notes

1

8/28

Introduction

Course organization. Introduction to game theory. Game representations. Normal form, extensive form. Solution concepts. Properties of 2-player 0-sum games.

Slides

2

8/30

Perfect-information games

Tree search methods for two-player perfect-information games. AlphaGo and AlphaZero.

[Silver et al., Nature’17]

Slides

3

9/6

Imperfect-information games 1

Review of solution concepts. LP formulation. Fictitious play and follow the regularized leader (FTRL). Regret. Multiplicative weights (MWU). Regret matching (RM) and RM+. Optimism.

[Section 4.1 and 4.6 of MAS]
[Section 4.2-3 of AGT]

Slides

4

9/11

Imperfect-information games 2

Markov decision processes (MDPs) and introduction to reinforcement learning. Markov games and partially-observable Markov games. Extensive-form games. Behavioral representation of a strategy. Sequence-form representation. Kuhn's theorem: relationship between normal-form and sequence-form strategies. Bottom-up decomposition of the sequence-form polytope.

[Section 5.2.3 of MAS]
[Section 3.10-11 of AGT]

Slides

5

9/13

Counterfactual regret minimization

Treeplex case: regret circuits for Cartesian products and for convex hull. Construction of CFR and pseudocode; proof of correctness and convergence speed. Alternation. Hard instances. Linear averaging.

[Zinkevich et al., NIPS’07]
[Farina et al., ICML’19]
[F21 Lecture Notes]

Slides

6

9/18

Provably correct techniques for speeding up CFR algorithms

Alternation. Reweighted updates of regrets and strategies, LCFR, DCFR. Dynamic pruning in imperfect-information games. Warm starting from given strategies. Optimistic regret minimization algorithms.

[Brown & Sandholm, ICML’17]
[Brown & Sandholm, AAAI’19]
[Farina, Kroer, and Sandholm AAAI’21]

Slides

7

9/20

Game abstraction 1: Practical state of the art

Lossless abstraction: GameShrink. Lossy state abstraction. Potential-aware, earth-mover-distance abstraction.

[Brown et al., AAMAS’15]

Slides

8

9/25

Game abstraction 2

Abstraction algorithm for distributed equilibrium finding. Action-abstraction algorithms. Reverse mapping. Abstraction pathology. Lossy abstraction with solution-quality bounds. Application of the theory to game modeling.

[Kroer & Sandholm, NeurIPS’18]

Slides

9

9/27

Multiagent learning dynamics

Guest Lecture by Ioannis Anagnostides.

[Anagnostides et al., ICML’22]

Slides

10

10/2

State-of-the-art for two-player no-limit Texas hold’em: Libratus

Subgame solving in imperfect-information games. Self-improver.

[Brown & Sandholm, Science’18]

Slides

11

10/4

State-of-the-art for multi-player no-limit Texas hold’em: Pluribus

Depth-limited subgame solving.

[Brown & Sandholm, Science’19]

Slides

12

10/9

Knowledge-limited subgame solving (KLSS), safe KLSS, and state of the art in dark chess.

Guest lecture by Brian Zhang.

[Zhang and Sandholm, NeurIPS’21]
[Liu et al., ICML’23]

Slides

13

10/11

Deep learning in tree-based game solving 1

Deep CFR as an alternative to abstraction. Neural Fictitious Play [Heinrich et al. 2016], multiagent policy gradient methods.

[Heinrich and Silver, Arxiv’16]
[Brown et al., ICML’19]

Slides

14

10/23

Deep learning in tree-based game solving 2

Loss estimation, Monte Carlo CFR (MCCFR), and sampling approaches. DREAM, ESCHER, Neural Replicator Dynamics.

[Lanctot et al., NeurIPS’09]
[McAleer et al., ICLR’23]

Slides

15

10/25

Deep learning in tree-based game solving 3

DeepNash for expert-level Stratego: State of the art for Stratego. Magnetic Mirror Descent.

[Perolat et al., Science’22]
[Perolat et al., ICML’21]
[Sokota et al., ICLR’23]

Slides

16

10/30

Deep learning in tree-based game solving 4: State-of-the-art for Dota and Starcraft.

AlphaStar and OpenAI 5.

[Vinyals et al., Nature’19]
[OpenAI et al., 2019]

Slides

17

11/1

State-of-the-art in double oracle algorithms

Double oracle (DO), policy space response oracles (PSRO), extensive-form double oracle (XDO), anytime PSRO, self-play PSRO, diversity in PSRO.

[Lanctot et al., NeurIPS’17]
[McAleer et al., NeurIPS’21]
[McAleer et al., 2022]

Slides

18

11/6

(Near)optimality certificates and how to learn them even in simulator-access (blackbox) games.

Guest lecture by Brian Zhang.

[Zhang & Sandholm, AAAI’21]

Slides

19

11/8

State-of-the-art in Diplomacy.

Guest lecture by Noam Brown (OpenAI).

[Meta Fundamental AI Research Diplomacy Team (FAIR)†, et al., Science’22]

20

11/13

Team games

Team maxmin equilibrium and TMECor; why the latter is often significantly better. Realization polytope: low dimensional but only the vertices are known and not the constraints; ways around that in practice. Team PSRO.

[Farina et al., NeurIPS’18]
[Zhang & Sandholm, arXiv’21]

Slides

21

11/15

Opponent exploitation

Different approaches to opponent exploitation. Hybrids with game theory. Safe opponent exploitation.

[Ganzfried & Sandholm, AAMAS’11]
[Ganzfried & Sandholm, TEAC’15]

Slides

22

11/20

Course project presentations

No class: optional project help in Tuomas' office.

23

11/27

Course project presentations

Carlos; Tobias and Patrick

24

11/29

Course project presentations

Naifeng; Arnhav and Abishek

25

12/4

Course project presentations

Harrison, Gil, and Cynthia; Emin

26

12/6

Course project presentations

May; Mingkuan; Richard and John

 

Related courses

Professor

Title

Year

University

Tuomas Sandholm and Gabriele Farina

Computational Game Solving

2021

CMU

Christian Kroer

Economics, AI, and Optimization

2020

Columbia

Tuomas Sandholm

Foundations of Electronic Marketplaces

2015

CMU

John P. Dickerson

Applied Mechanism Design for Social Good

2018

UMD

Fei Fang

Artificial Intelligence Methods for Social Good

2018

CMU

Constantinos Daskalakis

Games, Decision, and Computation

2015

MIT

Yiling Chen

Topics at the Interface between Computer Science and Economics

2016

Harvard

Vincent Conitzer

Computational Microeconomics: Game Theory, Social Choice, and Mechanism Design

2016

Duke

Sanmay Das

Multi-Agent Systems

2016

Wash U

Ariel Procaccia

Truth, Justice, and Algorithms

2016

CMU

Milind Tambe

Security and Game Theory

2016

USC

Tim Roughgarden

Algorithmic Game Theory

2013

Stanford