Tuesday, March 27, 2018. 12:00PM. GHC 6115.
Simon Du -- On the Power of Randomly Initialized Gradient Descent for Learning Convolutional Neural Networks
Abstract: Convolutional neural networks trained by randomly initialized (stochastic) gradient descent have achieved state-of-the-art performance in many applications. However, their theoretical properties remain elusive from an optimization point of view. In this talk, I will present two results that help explain the success of gradient descent.
In the first part, I will show that, under certain structural conditions on the input distribution, randomly initialized gradient descent provably learns a convolutional filter with ReLU activation and average pooling. This is the first recovery guarantee for gradient-based algorithms learning a convolutional filter on general input distributions.
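For readers unfamiliar with the setting, the model in the first part can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm or analysis from the talk: the patches are drawn as independent Gaussians (real convolutional patches overlap, and the result above concerns more general input distributions), labels come from a hypothetical ground-truth filter w_star, and the names predict, loss_and_grad, train and make_data are made up for this example.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def predict(w, Z):
    """Shared filter w applied to each patch, then ReLU, then average pooling.

    Z has shape (n, k, p): n inputs, each split into k patches of size p.
    """
    return relu(Z @ w).mean(axis=1)

def loss_and_grad(w, Z, y):
    """Squared loss of the pooled output and its (sub)gradient in w."""
    pre = Z @ w                                  # (n, k) pre-activations
    resid = relu(pre).mean(axis=1) - y           # (n,) residuals
    loss = 0.5 * np.mean(resid ** 2)
    active = (pre > 0).astype(Z.dtype)           # ReLU derivative, (n, k)
    # Derivative of the pooled output: average of the active patches.
    g = (active[:, :, None] * Z).mean(axis=1)    # (n, p)
    grad = (resid[:, None] * g).mean(axis=0)     # (p,)
    return loss, grad

def train(Z, y, steps=500, lr=0.1, seed=0):
    """Plain gradient descent from a random unit-norm initialization."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(Z.shape[2])
    w /= np.linalg.norm(w)
    losses = []
    for _ in range(steps):
        loss, grad = loss_and_grad(w, Z, y)
        losses.append(loss)
        w = w - lr * grad
    return w, losses

def make_data(n=2000, k=4, p=8, seed=1):
    """Assumed teacher setup: Gaussian patches labeled by a filter w_star."""
    rng = np.random.default_rng(seed)
    w_star = rng.standard_normal(p)
    w_star /= np.linalg.norm(w_star)
    Z = rng.standard_normal((n, k, p))
    return Z, predict(w_star, Z), w_star
```

Calling Z, y, w_star = make_data() followed by w, losses = train(Z, y) lets one inspect the loss curve and compare the learned w with w_star. The step size and number of steps are arbitrary choices for the sketch.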
In the second part of the talk, I will show that if the input distribution is Gaussian, then randomly initialized gradient descent with weight normalization learns a ReLU-activated one-hidden-layer convolutional neural network in which both the convolutional weights and the output weights are optimized. To the best of our knowledge, this is the first recovery guarantee for randomly initialized gradient-based algorithms on neural networks with more than one layer to be learned.
This talk is based on joint work with Jason D. Lee, Barnabas Poczos, Aarti Singh and Yuandong Tian.
Bio: Simon Shaolei Du is a PhD student in the Machine Learning Department at the School of Computer Science, Carnegie Mellon University, advised by Professor Aarti Singh and Professor Barnabas Poczos. His research interests broadly include topics in theoretical machine learning and statistics, such as deep learning, matrix factorization, convex/non-convex optimization, transfer learning, reinforcement learning, non-parametric statistics and robust statistics. Currently he is also developing methods for precision agriculture. In 2011, he graduated from The Experimental High School Attached to Beijing Normal University. In 2015, he obtained a B.S. in Engineering Math & Statistics and a B.S. in Electrical Engineering & Computer Science from the University of California, Berkeley. He has also spent time working at research labs at Microsoft and Facebook.