Announcements
Homework 3 is posted. Due 11/16 in class.
Homework 2 is posted. Due 10/19 in class.
Homework 1 is posted. Due 9/21 in class.
Room change! Lectures have been moved to Physics 299.
Instructor
Rong Ge, D226 LSRC
Email: rongge@cs.duke.edu
Lectures: Monday and Wednesday 10:05-11:20 AM, Physics 299.
Textbook: Ankur Moitra's monograph. Lecture notes for new topics will be provided.
Prerequisites: Familiarity with algorithms, probability and linear algebra.
Evaluation:
The course grades will be determined based on:
Date | Contents | References | Future Reading
8/29 | Intro: Machine Learning Problems | Slides
Nonnegative Matrix Factorization and Topic Models
8/31 | New algorithms for separable NMF | Notes; Section 2.3 in textbook | Sections 2.1-2.2 in textbook; [3]
9/5 | Topic Models | Notes; Section 2.4 in textbook | [4]
9/7 | Implementing the Topic Modeling Algorithm | Notes [5]; Slides | See links in the slides
[1] D. Lee and S. Seung. Learning the Parts of Objects by Nonnegative Matrix Factorization, Nature 1999.
[2] S. Vavasis. On the Complexity of Nonnegative Matrix Factorization, SIOPT 2009.
[3] S. Arora, R. Ge, R. Kannan and A. Moitra. Computing a Nonnegative Matrix Factorization - Provably, STOC 2012.
[4] S. Arora, R. Ge and A. Moitra. Learning Topic Models - Going Beyond SVD, FOCS 2012.
[5] S. Arora et al. A Practical Algorithm for Topic Modeling with Provable Guarantees, ICML 2013.
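As a rough illustration of the separable NMF ideas covered on 8/31, here is a minimal greedy anchor-selection sketch in NumPy. This is a simplified variant in the spirit of [3], not the course's official implementation; the function name and the toy matrix are our own.

```python
import numpy as np

def anchor_rows(M, k):
    """Greedily pick the row farthest from the span of the rows chosen
    so far; under separability these are the 'anchor' rows of M."""
    R = M.astype(float).copy()
    anchors = []
    for _ in range(k):
        i = int(np.argmax(np.linalg.norm(R, axis=1)))
        anchors.append(i)
        u = R[i] / np.linalg.norm(R[i])
        R -= np.outer(R @ u, u)  # project every row off the new direction
    return anchors

# Toy separable instance: rows 0 and 1 are the extreme (anchor) rows,
# rows 2 and 3 are nonnegative mixtures of them.
M = np.array([[4.0, 0.0], [0.0, 3.0], [1.0, 1.0], [2.0, 1.0]])
picked = anchor_rows(M, 2)
```

The key design choice is projection: once an anchor is found, removing its direction from every row guarantees the next farthest row is a genuinely new vertex rather than a rescaled copy of an old one.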

Matrix and Spectral Techniques  
9/12 | Singular Value Decompositions, Power Method | Notes
9/14 | Matrix Perturbations, Matrix Concentration | Notes | [1]
9/19 | Stochastic block model, McSherry's algorithm | Notes [2]
[1] G.W. Stewart and J.G. Sun. Matrix Perturbation Theory.
[2] F. McSherry. Spectral Partitioning of Random Graphs, FOCS 2001.
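Since the 9/12 lecture covers the power method, a minimal NumPy sketch may be a useful reference point. The example matrix and variable names are ours, not from the lecture notes.

```python
import numpy as np

def power_method(A, num_iters=100, seed=0):
    """Repeatedly multiply a random unit vector by A and renormalize;
    for symmetric A the iterate converges to the top eigenvector."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = A @ v
        v = w / np.linalg.norm(w)
    return v @ A @ v, v  # Rayleigh quotient = eigenvalue estimate

A = np.array([[2.0, 1.0], [1.0, 2.0]])  # eigenvalues 3 and 1
lam, v = power_method(A)
```

The convergence rate is governed by the eigenvalue gap: the component along the second eigenvector shrinks by a factor of 1/3 per iteration here.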

Tensor Decompositions  
9/21 | Tensor basics | Notes; Section 3.1 in textbook
9/26 | Tensor Decompositions | Notes; Section 3.1 in textbook | Section 3.2 in textbook
9/28 | Power Method | Notes; [4] | [4]
10/3 | Manipulating Moments: topic models, mixture of Gaussians | Notes; Section 3.5 in textbook | [3]
10/5 | Handling Asymmetric Tensors: Hidden Markov Models | Notes; Sections 3.3, 3.4 in textbook | [2]
[1] C. Hillar and L. Lim. Most Tensor Problems are NP-hard, JACM 2013.
[2] E. Mossel and S. Roch. Learning Nonsingular Phylogenies and Hidden Markov Models, STOC 2005.
[3] A. Anandkumar, D. Foster, D. Hsu, S. Kakade and Y. Liu. A Spectral Algorithm for Latent Dirichlet Allocation, NIPS 2012.
[4] A. Anandkumar, R. Ge, D. Hsu, S. Kakade and M. Telgarsky. Tensor decompositions for learning latent variable models, JMLR 2014.
[5] N. Goyal, S. Vempala and Y. Xiao. Fourier PCA, STOC 2014.
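For the 9/28 lecture on the tensor power method, a bare-bones sketch (in the spirit of [4], with our own toy tensor and names) shows the core update v <- T(I, v, v):

```python
import numpy as np

def tensor_power_iteration(T, num_iters=50, seed=0):
    """One run of the tensor power method on a symmetric 3-tensor T:
    repeatedly apply v <- T(I, v, v) and normalize. For an orthogonally
    decomposable T this converges to one of the components."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(T.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = np.einsum('ijk,j,k->i', T, v, v)
        v = w / np.linalg.norm(w)
    lam = np.einsum('ijk,i,j,k->', T, v, v, v)  # recovered weight
    return lam, v

# Toy tensor T = 2 * e1^{(x)3} + 1 * e2^{(x)3} with orthonormal components
e1, e2 = np.eye(2)
T = 2 * np.einsum('i,j,k->ijk', e1, e1, e1) + np.einsum('i,j,k->ijk', e2, e2, e2)
lam, v = tensor_power_iteration(T)
```

Unlike the matrix power method, the tensor version converges quadratically, and which component it finds depends on the random initialization; in practice one runs it from many starts and deflates.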

10/10 | Fall break
10/12 | Project: Groups and Topics | Slides
(Nonconvex) Optimization
10/17 | Basic optimization techniques | Notes | [1] [2]
10/19 | Stochastic Gradient and Variance Reduction | Notes | [2] [3]
10/24 | Approximate Gradient Descent | Notes; Sections 2 and 4 of [4] | [4]
10/26 | Saddle Point Problem | Guest Lecture by Chi; Blog Post
10/31 | Sparse Coding | Notes
11/2 | Matrix Completion | Notes | [6]
[1] Y. Nesterov. Introductory Lectures on Convex Optimization.
[2] A. Rakhlin, O. Shamir and K. Sridharan. Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization, ICML 2012.
[3] R. Johnson and T. Zhang. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction, NIPS 2013.
[4] S. Arora, R. Ge, T. Ma and A. Moitra. Simple, Efficient and Neural Algorithms for Sparse Coding, COLT 2015.
[5] R. Ge, F. Huang, C. Jin and Y. Yuan. Escaping from Saddle Points - Online Stochastic Gradient for Tensor Decomposition, COLT 2015.
[6] R. Ge, J. Lee and T. Ma. Matrix Completion has No Spurious Local Minimum, NIPS 2016.
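To accompany the 10/19 lecture on stochastic gradient methods, here is a plain SGD sketch on a noiseless least-squares problem. The objective, step size, and data are our own toy choices, not taken from the lecture notes.

```python
import numpy as np

def sgd_least_squares(X, y, lr=0.01, epochs=50, seed=0):
    """Plain SGD on f(w) = (1/2n) sum_i (x_i . w - y_i)^2,
    processing one random sample per step."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            grad = (X[i] @ w - y[i]) * X[i]  # gradient of the i-th term
            w -= lr * grad
    return w

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                    # noiseless labels
w_hat = sgd_least_squares(X, y)
```

With noiseless labels every per-sample gradient vanishes at w_true, so a constant step size converges; with noisy labels one would decay the step size (or use variance reduction as in [3]).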

Neural Networks and Deep Learning  
11/7 | Neural Networks and Backpropagation | Notes
11/9 | Representation Power I: Representer Theorems | Notes; [1]
11/14 | Representation Power II: Lower Bounds | Notes; [2] | [3] [4]
11/16 | Random Neural Networks | Notes; [5]
[1] A. Barron. Universal Approximation Bounds for Superpositions of a Sigmoidal Function, 1993.
[2] M. Telgarsky. Benefits of Depth in Neural Networks, COLT 2016.
[3] R. Eldan and O. Shamir. The Power of Depth for Feedforward Neural Networks, COLT 2016.
[4] N. Cohen, O. Sharir and A. Shashua. On the Expressive Power of Deep Learning: A Tensor Analysis, COLT 2016.
[5] S. Arora, A. Bhaskara, R. Ge and T. Ma. Provable Bounds for Learning Some Deep Representations, ICML 2014.
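For the 11/7 lecture on backpropagation, a minimal worked example of one forward/backward pass through a one-hidden-layer ReLU network may help. The architecture, loss, and names are our own illustrative choices.

```python
import numpy as np

def forward_backward(x, y, W1, w2):
    """One forward/backward pass: hidden ReLU layer W1, linear output w2,
    squared loss L = 0.5 * (yhat - y)^2. Returns loss and gradients."""
    # Forward pass
    z = W1 @ x
    h = np.maximum(0.0, z)        # ReLU activations
    yhat = w2 @ h                 # scalar prediction
    loss = 0.5 * (yhat - y) ** 2
    # Backward pass: chain rule applied layer by layer
    dyhat = yhat - y              # dL/dyhat
    dw2 = dyhat * h               # dL/dw2
    dz = dyhat * w2 * (z > 0)     # dL/dz, gated by the ReLU
    dW1 = np.outer(dz, x)         # dL/dW1
    return loss, dW1, dw2

rng = np.random.default_rng(0)
x, y = rng.standard_normal(3), 1.0
W1, w2 = rng.standard_normal((4, 3)), rng.standard_normal(4)
loss, dW1, dw2 = forward_backward(x, y, W1, w2)
```

A finite-difference check (perturb one weight, compare the loss change against the computed gradient entry) is the standard way to validate such a hand-written backward pass.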

11/21 | Summary: what we learned and what we didn't cover | Slides
11/23 | Thanksgiving
11/28 | Project Presentations
11/30 | Project Presentations