Tuesday, September 20, 2016. 12:00PM. NSH 3305.
Noam Brown - Reduced Space and Faster Convergence in Imperfect-Information Games via Regret-Based Pruning
Counterfactual Regret Minimization (CFR) is a leading algorithm for solving large zero-sum imperfect-information games. CFR is an iterative algorithm that repeatedly traverses the game tree, updating regrets at each information set. We introduce Regret-Based Pruning (RBP), an improvement to CFR that prunes any path of play in the tree, and its descendants, that has negative regret. It revisits that sequence at the earliest subsequent CFR iteration where the regret could have become positive, had that path been explored on every iteration. We prove that in zero-sum games it asymptotically prunes any action that is not part of a best response to some Nash equilibrium. This leads to provably faster convergence and lower space requirements. Experiments show that RBP can result in an order of magnitude reduction in both space and time needed to converge, and that the reduction factor increases with game size.
In partial fulfillment of the Speaking Requirement.