Tuesday, March 14, 2017. 12:00PM. NSH 3305.
Aaditya Ramdas - Multi A(rmed)/B(andit) Testing with online FDR control
Abstract: We propose a new framework as an alternative to existing setups for controlling false alarms across multiple A/B tests; it combines ideas from pure exploration for best-arm identification in multi-armed bandits (MAB) with online false discovery rate (FDR) control. This framework has various applications, ranging from pharmaceutical companies testing a control pill against a few treatment options to internet companies testing their current default webpage (control) against many alternatives (treatment). Our setup involves running a possibly infinite sequence of best-arm MAB instances, and controlling the overall FDR of the process in a fully online manner. Our main contributions are: (i) to propose reasonable definitions for a null hypothesis; (ii) to demonstrate how one can derive an always-valid sequential p-value for such a null hypothesis, which allows users to continuously monitor and stop any running MAB instance at any time; and (iii) to embed MAB instances within online FDR algorithms in a way that allows setting MAB confidence levels based on FDR rejection thresholds. In addition, we adapt existing theory from both the MAB and online FDR literature to ensure that our framework comes with strong sample-optimality guarantees, as well as control of the power and (a modified) FDR at any time. We run extensive simulations to verify our claims and report results on real data collected from the New Yorker Cartoon Caption contest.
Joint work with Fan Yang, Kevin Jamieson, Martin Wainwright.
Bio: Aaditya Ramdas is a postdoctoral researcher in Statistics and EECS at UC Berkeley, advised by Michael Jordan and Martin Wainwright. He finished his PhD in Statistics and Machine Learning at CMU, advised by Larry Wasserman and Aarti Singh. A lot of his research focuses on modern aspects of reproducibility in science and technology -- involving statistical testing and false discovery rate control in static and dynamic settings.