Probabilistic Graphical Models
School of Computer Science, Carnegie Mellon University
Syllabus and Course Schedule

Module | Lectures, readings, online materials | Homeworks, Exams, Deadlines

Module 0: Introduction (1 lecture)

Module 1: Fundamentals of Graphical Models

Basic representation

1. Representation of directed GMs: a. Independencies in directed graphs; b. The Bayes-ball algorithm; c. I-maps; d. Tabular CPDs; e. The equivalence theorem.
Reading:

2. Representation of undirected GMs: a. Independencies in undirected graphs; b. I-maps; c. Factors and potentials; d. The Hammersley-Clifford theorem; e. Parameterization and over-parameterization; factor graphs.
Reading:
Deadline: 9/19/07

3. Unified view of BNs and MRFs: a. The representational difference between BNs and MRFs; b. Moral graphs, chordal graphs, triangulation, and clique trees; c. Partially directed models; d. CRFs.
Recitation: Review of Lectures 1-3, 21st Sept. Reading:

Exact Inference (3 lectures)

4. The elimination algorithm (sketch below): a. The basic algorithm; b. Dealing with evidence; c. Complexity; d. Ordering; e. MAP queries; f. Example: HMM.
Recitation: Review of Lectures 3-4, 27th Sept. Reading:

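As an illustration for lecture 4, here is a minimal, non-authoritative sketch of sum-product variable elimination over discrete factors; the factor representation, function names, and the three-variable chain A -> B -> C are all made up for this example:

```python
# Minimal sketch of sum-product variable elimination over discrete factors.
# A factor is (variables, table), where `table` maps full assignments
# (tuples ordered like `variables`) to nonnegative reals.
import itertools

def multiply(f, g, card):
    """Pointwise product of two factors; `card` gives each variable's cardinality."""
    fvars, ftab = f
    gvars, gtab = g
    vars_ = list(dict.fromkeys(fvars + gvars))            # ordered union of scopes
    tab = {}
    for assign in itertools.product(*(range(card[v]) for v in vars_)):
        a = dict(zip(vars_, assign))
        tab[assign] = (ftab[tuple(a[v] for v in fvars)] *
                       gtab[tuple(a[v] for v in gvars)])
    return tuple(vars_), tab

def marginalize(f, var):
    """Sum a factor over one variable."""
    fvars, ftab = f
    keep = tuple(v for v in fvars if v != var)
    tab = {}
    for assign, val in ftab.items():
        key = tuple(a for v, a in zip(fvars, assign) if v != var)
        tab[key] = tab.get(key, 0.0) + val
    return keep, tab

def eliminate(factors, order, card):
    """Eliminate variables in `order`; return the product of the remaining factors."""
    factors = list(factors)
    for var in order:
        related = [f for f in factors if var in f[0]]
        others = [f for f in factors if var not in f[0]]
        prod = related[0]
        for f in related[1:]:
            prod = multiply(prod, f, card)
        factors = others + [marginalize(prod, var)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f, card)
    return result

# Example: chain A -> B -> C with binary variables; query P(C) by eliminating A, then B.
card = {'A': 2, 'B': 2, 'C': 2}
pA  = (('A',),     {(0,): 0.6, (1,): 0.4})
pBA = (('A', 'B'), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8})
pCB = (('B', 'C'), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.5, (1, 1): 0.5})
print(eliminate([pA, pBA, pCB], order=['A', 'B'], card=card))   # P(C=0)=0.624, P(C=1)=0.376
```
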
5. Message-passing algorithms on trees and on the original graph: a. The algorithms: sum-product and belief update; b. Incorrectness of SP on a general graph; c. Example.
Reading:

6. The junction tree algorithm: a. Clique trees from variable elimination; b. Clique trees from the chordal graph; c. The junction tree property; d. The calibrated clique tree as a distribution; e. The junction tree algorithm.
Reading:

Learning I: Bayesian Networks (4-5 lectures)

7. Learning one-node GMs (sketch below): a. Plates; b. Multinomial and Gaussian models; c. Density estimation (of discrete and continuous models); d. Maximum likelihood estimators (the frequentist approach); e. Bayesian approaches.
Reading:

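As a small companion to lecture 7, a sketch (with made-up counts and prior) contrasting the maximum-likelihood estimate of a single multinomial node with the Bayesian posterior-mean estimate under a Dirichlet prior:

```python
# Compare the MLE and the Dirichlet posterior mean for one multinomial node,
# assuming iid draws summarized by per-outcome counts (values are made up).
import numpy as np

counts = np.array([5, 3, 2])            # hypothetical counts n_k for K = 3 outcomes
alpha = np.ones(len(counts))            # symmetric Dirichlet prior Dir(1, 1, 1)

theta_mle = counts / counts.sum()                        # n_k / N
theta_bayes = (counts + alpha) / (counts + alpha).sum()  # mean of Dir(alpha + n)

print("MLE:           ", theta_mle)      # [0.5  0.3  0.2]
print("Posterior mean:", theta_bayes)    # approximately [0.4615  0.3077  0.2308]
```
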
8. Learning two-node GMs: a. Regression and classification; b. Generative models and discriminative models; c. Supervised learning.
Reading:
Project description due

9. Learning tabular CPTs of a structured full BN: a. Parameter independence and global decomposition; b. Local decomposition; c. Formal foundations: sufficient statistics and the exponential family.
Reading:

10. Structure learning (sketch below): a. Structure scores; b. Search; c. The Chow-Liu algorithm; d. Context-specific independence.
Recitation: Review of Exponential Families and GLIMs, 18th Oct. Reading:

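A rough sketch of the Chow-Liu idea from lecture 10: estimate pairwise mutual information from data, then keep a maximum-weight spanning tree. The binary data, sample size, and use of Prim's algorithm here are illustrative assumptions, not the course's implementation:

```python
# Chow-Liu sketch on binary data: empirical pairwise MI + maximum spanning tree.
import numpy as np

def mutual_info(x, y):
    """Empirical mutual information between two binary columns."""
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = np.mean((x == a) & (y == b))
            p_a, p_b = np.mean(x == a), np.mean(y == b)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 4))    # 500 samples of 4 made-up binary variables
d = X.shape[1]
MI = np.array([[mutual_info(X[:, i], X[:, j]) for j in range(d)] for i in range(d)])

# Prim's algorithm: repeatedly attach the highest-MI edge from the tree to a new node.
in_tree, edges = {0}, []
while len(in_tree) < d:
    i, j = max(((i, j) for i in in_tree for j in range(d) if j not in in_tree),
               key=lambda e: MI[e])
    edges.append((i, j))
    in_tree.add(j)
print("Chow-Liu tree edges:", edges)
```
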
11. EM: learning from partially observed data (sketch below): a. Two-node graphs with discrete hidden variables; b. Mixture models and clustering problems; c. Unsupervised learning and the EM algorithm.
Reading:

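For lecture 11, a minimal EM sketch for a two-component one-dimensional Gaussian mixture; the synthetic data, initialization, and iteration count are arbitrary choices for illustration:

```python
# EM for a 2-component 1-D Gaussian mixture on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 300)])

K = 2
pi = np.ones(K) / K                 # mixing weights
mu = np.array([-1.0, 1.0])          # initial means
sigma = np.ones(K)                  # initial standard deviations

for _ in range(50):
    # E-step: responsibilities r[n, k] = p(z_n = k | x_n, current parameters)
    dens = np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    r = pi * dens
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, and variances from weighted counts
    Nk = r.sum(axis=0)
    pi = Nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / Nk
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)

print("weights:", pi, "means:", mu, "std devs:", sigma)
```
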
Case studies: Popular Bayesian networks and MRFs

HMM revisited (brief, to be covered fully in recitation; sketch below): a. Forward-backward; b. Viterbi; c. Baum-Welch; d. A case study: gene finding from DNA sequences.

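A minimal sketch of the forward-backward recursions listed above, with made-up transition and emission matrices and per-step scaling for numerical stability:

```python
# Forward-backward for a 2-state discrete HMM (made-up parameters).
import numpy as np

A = np.array([[0.9, 0.1],        # transition matrix p(z_t | z_{t-1})
              [0.2, 0.8]])
B = np.array([[0.7, 0.3],        # emission matrix p(x_t | z_t) over 2 symbols
              [0.1, 0.9]])
pi0 = np.array([0.5, 0.5])       # initial state distribution
obs = [0, 1, 1, 0, 1]            # an observed symbol sequence

T, K = len(obs), len(pi0)
alpha, beta, scale = np.zeros((T, K)), np.zeros((T, K)), np.zeros(T)

# Forward pass: alpha[t] is proportional to p(z_t, x_{1:t})
alpha[0] = pi0 * B[:, obs[0]]
scale[0] = alpha[0].sum(); alpha[0] /= scale[0]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    scale[t] = alpha[t].sum(); alpha[t] /= scale[t]

# Backward pass: beta[t] is proportional to p(x_{t+1:T} | z_t)
beta[T - 1] = 1.0
for t in range(T - 2, -1, -1):
    beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]

gamma = alpha * beta
gamma /= gamma.sum(axis=1, keepdims=True)    # posterior marginals p(z_t | x_{1:T})
print(gamma)
```
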
12. CRFs: a. Representation; b. CRFs vs. HMMs.
Reading: Lafferty et al.; Sha et al.

Multivariate Gaussian Models

13. Factor analysis: a. Manipulating multivariate Gaussians; b. Some useful linear algebra and matrix calculus; c. Factor analysis and the EM algorithm.
Reading:

Temporal models

14. The Kalman filter (sketch below): a. The RTS algorithm; b. The junction tree version of RTS.
Reading:

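A minimal sketch of the Kalman filter's predict/update recursions for lecture 14, applied to a one-dimensional random-walk state observed in noise; the model matrices, noise levels, and observations are made up, and the RTS smoother is omitted:

```python
# Kalman filter (predict + update) on a 1-D random walk with noisy observations.
import numpy as np

F, H = np.array([[1.0]]), np.array([[1.0]])      # state transition and observation models
Q, R = np.array([[0.01]]), np.array([[0.25]])    # process and observation noise covariances

x = np.array([[0.0]])     # state mean
P = np.array([[1.0]])     # state covariance

for z in [0.4, 0.5, 0.3, 0.7, 0.6]:              # made-up observations
    # Predict step
    x = F @ x
    P = F @ P @ F.T + Q
    # Update step
    y = np.array([[z]]) - H @ x                  # innovation
    S = H @ P @ H.T + R                          # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)               # Kalman gain
    x = x + K @ y
    P = (np.eye(1) - K @ H) @ P
    print("filtered mean:", x.item(), "variance:", P.item())
```
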
Complex Graphical Models

16. Overview of popular intractable BNs and MRFs: a. Dynamic Bayesian networks; b. Bayesian admixture models (LDA); c. Ising models; d. Feature-based models; e. General exponential family models; f. The need for approximate inference.
Reading: fHMMs, Switching SSM, LDA

Approximate Inference

17. Variational inference 1: a. Theory of loopy belief propagation; b. Bethe free energy and Kikuchi approximations; c. Case study on the Ising model.
Reading: Yedidia et al., 2004

18. Variational inference 2 (sketch below): a. Theory of mean field inference; b. Lower bounds; c. Structured mean field algorithms; d. Variational Bayesian learning; e. The generalized mean field algorithm; f. Case study on LDA.
Reading:

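As a concrete illustration of naive mean field for lecture 18, a sketch of the fixed-point updates m_i <- tanh(beta (J * sum of neighbouring m_j + h_i)) for an Ising model on a small grid; the grid size, coupling, field, and iteration count are made-up choices:

```python
# Naive mean-field updates for an Ising model on an n-by-n grid (zero-padded boundary).
import numpy as np

n, J, beta = 8, 1.0, 0.4
h = np.zeros((n, n))                                            # external field
m = np.random.default_rng(0).uniform(-0.1, 0.1, size=(n, n))    # initial site means

for _ in range(200):
    nbr = np.zeros_like(m)                   # sum of the 4 neighbouring means
    nbr[1:, :] += m[:-1, :]; nbr[:-1, :] += m[1:, :]
    nbr[:, 1:] += m[:, :-1]; nbr[:, :-1] += m[:, 1:]
    m = np.tanh(beta * (J * nbr + h))        # synchronous mean-field update

print("mean magnetization:", m.mean())
```
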
19. Monte Carlo inference 1 (sketch below): a. Introduction to sampling methods; b. Importance sampling; c. Weighted resampling; d. Particle filters.
Reading:

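For lecture 19, a minimal sketch of self-normalized importance sampling; the target, the proposal, and the test function f(x) = x^2 are arbitrary choices for illustration:

```python
# Self-normalized importance sampling: estimate E_p[x^2] for p = N(2, 0.5^2)
# using a broad proposal q = N(0, 3^2).
import numpy as np

rng = np.random.default_rng(0)

def gauss_pdf(x, mu, s):
    return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

x = rng.normal(0.0, 3.0, size=10_000)                # samples from the proposal q
w = gauss_pdf(x, 2.0, 0.5) / gauss_pdf(x, 0.0, 3.0)  # importance weights p(x)/q(x)
w /= w.sum()                                         # self-normalize

estimate = np.sum(w * x ** 2)
print("IS estimate of E_p[x^2]:", estimate, "(exact value is 4.25)")
```
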
20. Monte Carlo inference 2 (sketch below): a. Collapsed sampling; b. Markov chain Monte Carlo; c. The Metropolis-Hastings algorithm; d. The Gibbs sampling algorithm and convergence tests; e. The data augmentation algorithm and EM.
Reading:

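For lecture 20, a minimal sketch of single-site Gibbs sampling for an Ising model on a grid with periodic boundaries; each spin is resampled from its conditional given its four neighbours (grid size, coupling, and temperature are made up):

```python
# Single-site Gibbs sampling for an Ising model on an n-by-n torus.
import numpy as np

rng = np.random.default_rng(0)
n, J, beta = 10, 1.0, 0.4
s = rng.choice([-1, 1], size=(n, n))                 # random initial spins

for sweep in range(100):
    for i in range(n):
        for j in range(n):
            nbr = (s[(i - 1) % n, j] + s[(i + 1) % n, j] +
                   s[i, (j - 1) % n] + s[i, (j + 1) % n])
            p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * J * nbr))   # p(s_ij = +1 | rest)
            s[i, j] = 1 if rng.random() < p_up else -1

print("magnetization after 100 sweeps:", s.mean())
```
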
21. Nonparametric Bayesian models (sketch below): a. The Dirichlet process; b. Dirichlet process mixtures and infinite mixture models; c. The infinite HMM.
Reading:

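For lecture 21, a sketch of the Chinese restaurant process view of the Dirichlet process: customer n sits at an existing table with probability proportional to its occupancy, or starts a new table with probability proportional to the concentration alpha (the values of alpha and N are made up):

```python
# Chinese restaurant process: table sizes correspond to mixture-component counts.
import numpy as np

rng = np.random.default_rng(0)
alpha, N = 1.0, 100
tables = []                                   # customers seated at each table

for n in range(N):
    probs = np.array(tables + [alpha], dtype=float)
    probs /= probs.sum()
    k = rng.choice(len(probs), p=probs)
    if k == len(tables):
        tables.append(1)                      # open a new table (new component)
    else:
        tables[k] += 1

print("table sizes after", N, "customers:", tables)
```
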
22. Learning undirected graphical models: a. CRFs and general Markov networks; b. Iterative scaling; c. Contrastive divergence algorithms.
Reading:
Homework 4 due

23. Learning in structured input-output spaces: a. Maximum-margin Markov networks.
Reading:

24. Discussion of discriminative learning objectives, a unified view: a. Likelihood-based and maximum-margin learning; b. Convex optimization; c. Learning algorithms.
Reading:

11/30/07: Poster presentation of final project

25. Applications.
Reading:
Submit final report

Final Exam (take home)
All material thus far

Recitation Schedule

Date | Time | Place | Topic
Sep 21 | 5-6 pm | WeH 4623 | Lectures 1-3
Sep 27 | 6-7 pm | NSH 3001 | Lectures 3-4
Oct 4 | 6-7 pm | NSH 3001 | BP, Junction Tree
Oct 11 | 6-7 pm | WeH 4623 | LMS
Oct 18 | 6-7 pm | NSH 3001 | Exp Family, GLIMs
Oct 25 | 6-7 pm | NSH 3001 |
Nov 1 | 6-7 pm | NSH 3001 |
Nov 8 | 6-7 pm | NSH 3001 |