BL-WoLF: A Framework For Loss-Bounded Learnability In Zero-Sum Games
ICML-2003
Vincent Conitzer
I will present a framework for learning in games called BL-WoLF, which can be used to analyze the inherent cost to a player of not knowing the structure of a (zero-sum) game. In our model, the learner initially only knows the family of games that she is in, but not the actual game. We can evaluate any learning strategy by the expected cumulative loss the learner incurs, with a worst-case selection of the game and a worst-case opponent. I will discuss both guaranteed learnability, where the learner must know a minimax strategy for the game after losing a given amount, and nonguaranteed learnability, where the learner's expected cumulative loss is bounded.
Planning in the Presence of Cost Functions Controlled by an Adversary
ICML-2003
Brendan McMahan
We investigate methods for planning in a Markov Decision Process where the cost function is chosen by an adversary after we fix our policy. As a running example, we consider a robot path planning problem where costs are influenced by sensors that an adversary places in the environment. We formulate the problem as a zero-sum matrix game where rows correspond to deterministic policies for the planning player and columns correspond to cost vectors the adversary can select. For a fixed cost vector, fast algorithms (such as value iteration) are available for solving MDPs. We develop efficient algorithms for matrix games where such best response oracles exist. We show that for our path planning problem these algorithms are at least an order of magnitude faster than direct solution of the linear programming formulation.
Charles Rosenberg Last modified: Thu Oct 9 13:26:29 EDT 2003