Information Processing and Learning

10-704, Fall 2016

Aarti Singh

Teaching Assistant: Shashank Singh
Class Assistant: Sandra Winkler



Date | Lecture Topics (Tentative) | Suggested Readings | Assignments (Tentative)
Aug 29
Notes

  • Intro
  • Information Content, Entropy

  • Cover-Thomas: 2.1
  • MacKay: 2.4, 4.1
Aug 31
Notes

  • Joint, Conditional and Relative Entropy
  • Connection to Maximum Likelihood Estimation
  • Mutual Information
  • Connection between channel coding and inference
  • Properties of Information-theoretic quantities
  • Gibbs' inequality
  • Cover-Thomas: 2.2-2.7
  • MacKay: Ch 8
Sept 5
No Class -- Labor Day
Sept 7
Notes

  • Data Processing Inequality
  • Submodularity of Entropy and Mutual Information
  • Application: Sensor Placement
HW 1 released.
QNA 1 released.
Sept 12
Notes

  • Greedy Maximization for Submodular functions
  • Differential Entropy
  • Application: Clustering
  • Estimators for entropy
Sept 14
Notes
  • Estimators for entropy of discrete and continuous random variables
  • Plug-in, Integral, Resubstitution and Splitting Data Estimators
QNA 1 due.
Sept 19
Notes
  • Splitting Data Estimator Analysis via Von Mises Expansion
  • Estimating Mutual Information
  • More Applications to Machine Learning
    • Feature selection via Information gain scores
    • Independence testing
    • Machine Learning on Distributions
    • Structure learning in tree graphical models
Sept 21
Notes
  • Application: Structure learning in general graphical models
  • Maximum entropy density estimation
  • I-projection
QNA 2 released.
Sept 26
Notes
  • MaxEnt Duality
  • Generalized MaxEnt and Regularized Maximum Likelihood
Sept 28
Notes
Siheng's handwritten Notes
  • Entropy Rate of a Stochastic Process
  • Burg's Max Entropy Rate Theorem
  • Source coding basics
  • Source Coding Theorem
  • Cover-Thomas: 4.1, 4.2, 12.5, 12.6
  • Cover-Thomas: 3, 5.1
QNA 2 due.
HW 1 due.
Oct 3
Notes
  • Kraft and McMillan Theorems
  • Huffman Codes
  • Cover-Thomas: 5.1-5.8
Oct 5
Notes
  • Empirical Risk Minimization and Prefix Codes
  • Complexity Penalized ERM via Prefix codes
  • Example: Histogram Classifiers
  • Example: Decision Tree Classifiers
Project Proposal due.
HW 2 released.
Oct 10
Notes
  • Unbounded loss functions and infinite classes
  • Example: Histogram Regression
  • Example: Wavelet De-noising
QNA 3 released.
Oct 12
Notes
  • Example: Markov chains
  • Minimum Description Length Principle
  • Sequential/Universal Prediction and Universal Coding
  • Minimax Regret and Redundancy
  • Exponential Weights update
Oct 17
Quiz 1
Oct 19
Notes
  • Regret Guarantees for Exponential Weights
  • Minimax and Bayesian Redundancy
  • Redundancy Capacity Theorem
QNA 3 due.
Oct 24
Notes
  • Sequential Prediction with Other losses
  • Loss-based redundancy and regret bounds
Oct 26
Notes
  • Universal Coding
  • Context-Tree-Weighting
  • Arithmetic Coding
  • Sufficient Statistics
QNA 4 released.
HW 2 due (Oct 28).
Oct 31
Notes
  • Information Bottleneck Principle
  • Rate distortion function
  • Blahut-Arimoto Algorithm
  • Information Bottleneck papers 1, 2
  • Cover-Thomas: 10.1-10.3
Nov 2
Notes
  • Rate Distortion Theorem
  • Channel Capacity
  • Channel Coding Theorem
  • Independent Gaussian channels
  • Cover-Thomas: 10.5-10.8
  • Cover-Thomas: 7.1-7.7
  • Cover-Thomas: 9.1, 9.4
QNA 4 due (Nov 4).
Nov 7
Notes
  • Correlated Gaussian channels
  • Multi-antenna channels (known, random)
  • Average Privacy via Noisy Random Projection
  • Differential Privacy via Noisy Random Projection
Project Midterm Report due.
Nov 9
Notes
  • Differential Privacy via Rate-distortion
  • Converse of Channel coding theorem
  • Minimax Theory and Testing
  • Le Cam's Method
  • Lower bounds on normal means problems
Nov 14
Notes
  • Neyman-Pearson Lemma
  • Global and Local Fano's Method
  • Application to sparse normal mean testing
  • Minimax Lower bounds for Estimation
  • Strong Data Processing Inequalities
QNA 5 released.
Nov 16
Notes
  • Data Processing Inequalities and Minimax Lower Bounds
  • Strong Data Processing under Differential Privacy
  • Strong Data Processing under Compression
  • Strong Data Processing under Communication Constraints
Nov 21
Quiz 2
QNA 5 due (Nov 22).
Nov 23
No Class -- Thanksgiving Holiday
Nov 28
Project Presentations
HW 3 released.
Nov 30
Project Presentations
Dec 5
Notes
  • Large deviation theory - Sanov's theorem
  • Method of Types
  • Error exponents in Hypothesis testing
  • Cover-Thomas: Ch 11
Project Final Reports due (Dec 4).
Dec 7
Notes
  • Cramér-Rao lower bound
  • Fisher Information
HW 3 due (Dec 12).