Tuesday, Oct 29th, 2019. 12:00 PM. NSH 3305
Yuandong Tian -- Over-parameterization as a Catalyst in Generalization of Deep ReLU networks via Student-Teacher Setting
Abstract: In this talk, we study the generalization behavior of deep networks in the interpolation region, where the training error and the gradient at each training data point are small. We use a teacher-student setting: both the student and the teacher are deep ReLU networks, and the student learns from the output of a fixed teacher with SGD. Our conclusion is two-fold. First, with minimal assumptions on the teacher network and the training set, we prove that a small gradient at each training data point leads to weight alignment between the teacher and student networks at the lowest layer, if both the student and the teacher are. Furthermore, the proof shows that over-parameterization makes such alignment more likely to happen. Second, further analysis of the training dynamics shows that the student network learns the strong teacher nodes first, leaving weak teacher nodes unexplained until a late stage of training, and that over-parameterization can help cover more weak nodes within the same number of iterations. This sheds light on the puzzling phenomenon that low training error and over-parameterization can lead to good generalization.
Bio: Yuandong Tian is a Research Scientist and Manager at Facebook AI Research, working on deep reinforcement learning and its applications, and on the theoretical analysis of deep models. He is the lead scientist and engineer for the ELF OpenGo and DarkForest Go projects. Prior to that, he was a researcher and engineer on the Google Self-driving Car team in 2013-2014. He received his Ph.D. from the Robotics Institute, Carnegie Mellon University, in 2013, and his Bachelor's and Master's degrees in Computer Science from Shanghai Jiao Tong University. He is a recipient of the 2013 ICCV Marr Prize Honorable Mention.