Probabilistic Graphical Models
School of Computer Science, Carnegie Mellon University
Syllabus and Course Schedule

Module | Lectures, readings, online materials | Homeworks, Exams, Deadlines

Module 0: Introduction (1 lecture)

Module 1: Fundamentals of Graphical Models

Basic representation

1. Representation of directed GMs: a. Independencies in directed graphs; b. The Bayes-ball algorithm; c. I-maps; d. Tabular CPDs; e. The equivalence theorem.
Reading:

2. Representation of undirected GMs: a. Independencies in undirected graphs; b. I-maps; c. Factors and potentials; d. The Hammersley-Clifford theorem; e. Parameterization and over-parameterization; factor graphs.
Reading:
Deadline: 9/19/07

3. Unified view of BNs and MRFs: a. The representational difference between BNs and MRFs; b. Moral graphs, chordal graphs, triangulation, and clique trees; c. Partially directed models; d. CRFs.
Recitation: Review of Lectures 1-3, 21st Sept. Reading:

Exact Inference (3 lectures)

4. The elimination algorithm (sketch below): a. The basic algorithm; b. Dealing with evidence; c. Complexity; d. Ordering; e. MAP queries; f. Example: HMM.
Recitation: Review of Lectures 3-4, 27th Sept. Reading:

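As an illustration for lecture 4, here is a minimal, non-authoritative sketch of sum-product variable elimination over discrete factors; the factor representation, function names, and the three-variable chain A -> B -> C are all made up for this example:

```python
# Minimal sketch of sum-product variable elimination over discrete factors.
# A factor is (variables, table), where `table` maps full assignments
# (tuples ordered like `variables`) to nonnegative reals.
import itertools

def multiply(f, g, card):
    """Pointwise product of two factors; `card` gives each variable's cardinality."""
    fvars, ftab = f
    gvars, gtab = g
    vars_ = list(dict.fromkeys(fvars + gvars))            # ordered union of scopes
    tab = {}
    for assign in itertools.product(*(range(card[v]) for v in vars_)):
        a = dict(zip(vars_, assign))
        tab[assign] = (ftab[tuple(a[v] for v in fvars)] *
                       gtab[tuple(a[v] for v in gvars)])
    return tuple(vars_), tab

def marginalize(f, var):
    """Sum a factor over one variable."""
    fvars, ftab = f
    keep = tuple(v for v in fvars if v != var)
    tab = {}
    for assign, val in ftab.items():
        key = tuple(a for v, a in zip(fvars, assign) if v != var)
        tab[key] = tab.get(key, 0.0) + val
    return keep, tab

def eliminate(factors, order, card):
    """Eliminate variables in `order`; return the product of the remaining factors."""
    factors = list(factors)
    for var in order:
        related = [f for f in factors if var in f[0]]
        others = [f for f in factors if var not in f[0]]
        prod = related[0]
        for f in related[1:]:
            prod = multiply(prod, f, card)
        factors = others + [marginalize(prod, var)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f, card)
    return result

# Example: chain A -> B -> C with binary variables; query P(C) by eliminating A, then B.
card = {'A': 2, 'B': 2, 'C': 2}
pA  = (('A',),     {(0,): 0.6, (1,): 0.4})
pBA = (('A', 'B'), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8})
pCB = (('B', 'C'), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.5, (1, 1): 0.5})
print(eliminate([pA, pBA, pCB], order=['A', 'B'], card=card))   # P(C=0)=0.624, P(C=1)=0.376
```
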
5. Message-passing algorithms on trees and on the original graph: a. The algorithms: sum-product and belief update; b. Incorrectness of SP on a general graph; c. Example.
Reading:

6. The junction tree algorithm: a. Clique trees from variable elimination; b. Clique trees from the chordal graph; c. The junction tree property; d. The calibrated clique tree as a distribution; e. The junction tree algorithm.
Reading:

Learning I: Bayesian Networks (4-5 lectures)

7. Learning one-node GMs (sketch below): a. Plates; b. Multinomial and Gaussian models; c. Density estimation (of discrete and continuous models); d. Maximum likelihood estimators (the frequentist approach); e. Bayesian approaches.
Reading:

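As a small companion to lecture 7, a sketch (with made-up counts and prior) contrasting the maximum-likelihood estimate of a single multinomial node with the Bayesian posterior-mean estimate under a Dirichlet prior:

```python
# Compare the MLE and the Dirichlet posterior mean for one multinomial node,
# assuming iid draws summarized by per-outcome counts (values are made up).
import numpy as np

counts = np.array([5, 3, 2])            # hypothetical counts n_k for K = 3 outcomes
alpha = np.ones(len(counts))            # symmetric Dirichlet prior Dir(1, 1, 1)

theta_mle = counts / counts.sum()                        # n_k / N
theta_bayes = (counts + alpha) / (counts + alpha).sum()  # mean of Dir(alpha + n)

print("MLE:           ", theta_mle)      # [0.5  0.3  0.2]
print("Posterior mean:", theta_bayes)    # approximately [0.4615  0.3077  0.2308]
```
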
8. Learning two-node GMs: a. Regression and classification; b. Generative models and discriminative models; c. Supervised learning.
Reading:
Project description due

9. Learning tabular CPTs of a structured full BN: a. Parameter independence and global decomposition; b. Local decomposition; c. Formal foundations: sufficient statistics and the exponential family.
Reading:

10. Structure learning (sketch below): a. Structure scores; b. Search; c. The Chow-Liu algorithm; d. Context-specific independence.
Recitation: Review of Exponential Families and GLIMs, 18th Oct. Reading:

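A rough sketch of the Chow-Liu idea from lecture 10: estimate pairwise mutual information from data, then keep a maximum-weight spanning tree. The binary data, sample size, and use of Prim's algorithm here are illustrative assumptions, not the course's implementation:

```python
# Chow-Liu sketch on binary data: empirical pairwise MI + maximum spanning tree.
import numpy as np

def mutual_info(x, y):
    """Empirical mutual information between two binary columns."""
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = np.mean((x == a) & (y == b))
            p_a, p_b = np.mean(x == a), np.mean(y == b)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 4))    # 500 samples of 4 made-up binary variables
d = X.shape[1]
MI = np.array([[mutual_info(X[:, i], X[:, j]) for j in range(d)] for i in range(d)])

# Prim's algorithm: repeatedly attach the highest-MI edge from the tree to a new node.
in_tree, edges = {0}, []
while len(in_tree) < d:
    i, j = max(((i, j) for i in in_tree for j in range(d) if j not in in_tree),
               key=lambda e: MI[e])
    edges.append((i, j))
    in_tree.add(j)
print("Chow-Liu tree edges:", edges)
```
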
11. EM: learning from partially observed data (sketch below): a. Two-node graphs with discrete hidden variables; b. Mixture models and clustering problems; c. Unsupervised learning and the EM algorithm.
Reading:

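For lecture 11, a minimal EM sketch for a two-component one-dimensional Gaussian mixture; the synthetic data, initialization, and iteration count are arbitrary choices for illustration:

```python
# EM for a 2-component 1-D Gaussian mixture on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 300)])

K = 2
pi = np.ones(K) / K                 # mixing weights
mu = np.array([-1.0, 1.0])          # initial means
sigma = np.ones(K)                  # initial standard deviations

for _ in range(50):
    # E-step: responsibilities r[n, k] = p(z_n = k | x_n, current parameters)
    dens = np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    r = pi * dens
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, and variances from weighted counts
    Nk = r.sum(axis=0)
    pi = Nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / Nk
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)

print("weights:", pi, "means:", mu, "std devs:", sigma)
```
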
Case studies: Popular Bayesian networks and MRFs

HMM revisited (brief, to be covered fully in recitation; sketch below): a. Forward-backward; b. Viterbi; c. Baum-Welch; d. A case study: gene finding from DNA sequences.

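A minimal sketch of the forward-backward recursions listed above, with made-up transition and emission matrices and per-step scaling for numerical stability:

```python
# Forward-backward for a 2-state discrete HMM (made-up parameters).
import numpy as np

A = np.array([[0.9, 0.1],        # transition matrix p(z_t | z_{t-1})
              [0.2, 0.8]])
B = np.array([[0.7, 0.3],        # emission matrix p(x_t | z_t) over 2 symbols
              [0.1, 0.9]])
pi0 = np.array([0.5, 0.5])       # initial state distribution
obs = [0, 1, 1, 0, 1]            # an observed symbol sequence

T, K = len(obs), len(pi0)
alpha, beta, scale = np.zeros((T, K)), np.zeros((T, K)), np.zeros(T)

# Forward pass: alpha[t] is proportional to p(z_t, x_{1:t})
alpha[0] = pi0 * B[:, obs[0]]
scale[0] = alpha[0].sum(); alpha[0] /= scale[0]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    scale[t] = alpha[t].sum(); alpha[t] /= scale[t]

# Backward pass: beta[t] is proportional to p(x_{t+1:T} | z_t)
beta[T - 1] = 1.0
for t in range(T - 2, -1, -1):
    beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]

gamma = alpha * beta
gamma /= gamma.sum(axis=1, keepdims=True)    # posterior marginals p(z_t | x_{1:T})
print(gamma)
```
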
12. CRFs: a. Representation; b. CRFs vs. HMMs.
Reading: Lafferty et al.; Sha et al.

Multivariate Gaussian Models

13. Factor analysis: a. Manipulating multivariate Gaussians; b. Some useful linear algebra and matrix calculus; c. Factor analysis and the EM algorithm.
Reading:

Temporal models

14. The Kalman filter (sketch below): a. The RTS algorithm; b. The junction tree version of RTS.
Reading:

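A minimal sketch of the Kalman filter's predict/update recursions for lecture 14, applied to a one-dimensional random-walk state observed in noise; the model matrices, noise levels, and observations are made up, and the RTS smoother is omitted:

```python
# Kalman filter (predict + update) on a 1-D random walk with noisy observations.
import numpy as np

F, H = np.array([[1.0]]), np.array([[1.0]])      # state transition and observation models
Q, R = np.array([[0.01]]), np.array([[0.25]])    # process and observation noise covariances

x = np.array([[0.0]])     # state mean
P = np.array([[1.0]])     # state covariance

for z in [0.4, 0.5, 0.3, 0.7, 0.6]:              # made-up observations
    # Predict step
    x = F @ x
    P = F @ P @ F.T + Q
    # Update step
    y = np.array([[z]]) - H @ x                  # innovation
    S = H @ P @ H.T + R                          # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)               # Kalman gain
    x = x + K @ y
    P = (np.eye(1) - K @ H) @ P
    print("filtered mean:", x.item(), "variance:", P.item())
```
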
Complex Graphical Models

16. Overview of popular intractable BNs and MRFs: a. Dynamic Bayesian networks; b. Bayesian admixture models (LDA); c. Ising models; d. Feature-based models; e. General exponential family models; f. The need for approximate inference.
Reading: fHMMs, Switching SSM, LDA

Approximate Inference

17. Variational inference 1: a. Theory of loopy belief propagation; b. Bethe free energy and Kikuchi approximations; c. Case study on the Ising model.
Reading: Yedidia et al., 2004

18. Variational inference 2 (sketch below): a. Theory of mean field inference; b. Lower bounds; c. Structured mean field algorithms; d. Variational Bayesian learning; e. The generalized mean field algorithm; f. Case study on LDA.
Reading:

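As a concrete illustration of naive mean field for lecture 18, a sketch of the fixed-point updates m_i <- tanh(beta (J * sum of neighbouring m_j + h_i)) for an Ising model on a small grid; the grid size, coupling, field, and iteration count are made-up choices:

```python
# Naive mean-field updates for an Ising model on an n-by-n grid (zero-padded boundary).
import numpy as np

n, J, beta = 8, 1.0, 0.4
h = np.zeros((n, n))                                            # external field
m = np.random.default_rng(0).uniform(-0.1, 0.1, size=(n, n))    # initial site means

for _ in range(200):
    nbr = np.zeros_like(m)                   # sum of the 4 neighbouring means
    nbr[1:, :] += m[:-1, :]; nbr[:-1, :] += m[1:, :]
    nbr[:, 1:] += m[:, :-1]; nbr[:, :-1] += m[:, 1:]
    m = np.tanh(beta * (J * nbr + h))        # synchronous mean-field update

print("mean magnetization:", m.mean())
```
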
19. Monte Carlo inference 1 (sketch below): a. Introduction to sampling methods; b. Importance sampling; c. Weighted resampling; d. Particle filters.
Reading:

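For lecture 19, a minimal sketch of self-normalized importance sampling; the target, the proposal, and the test function f(x) = x^2 are arbitrary choices for illustration:

```python
# Self-normalized importance sampling: estimate E_p[x^2] for p = N(2, 0.5^2)
# using a broad proposal q = N(0, 3^2).
import numpy as np

rng = np.random.default_rng(0)

def gauss_pdf(x, mu, s):
    return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

x = rng.normal(0.0, 3.0, size=10_000)                # samples from the proposal q
w = gauss_pdf(x, 2.0, 0.5) / gauss_pdf(x, 0.0, 3.0)  # importance weights p(x)/q(x)
w /= w.sum()                                         # self-normalize

estimate = np.sum(w * x ** 2)
print("IS estimate of E_p[x^2]:", estimate, "(exact value is 4.25)")
```
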
20. Monte Carlo inference 2 (sketch below): a. Collapsed sampling; b. Markov chain Monte Carlo; c. The Metropolis-Hastings algorithm; d. The Gibbs sampling algorithm and convergence tests; e. The data augmentation algorithm and EM.
Reading:

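For lecture 20, a minimal sketch of single-site Gibbs sampling for an Ising model on a grid with periodic boundaries; each spin is resampled from its conditional given its four neighbours (grid size, coupling, and temperature are made up):

```python
# Single-site Gibbs sampling for an Ising model on an n-by-n torus.
import numpy as np

rng = np.random.default_rng(0)
n, J, beta = 10, 1.0, 0.4
s = rng.choice([-1, 1], size=(n, n))                 # random initial spins

for sweep in range(100):
    for i in range(n):
        for j in range(n):
            nbr = (s[(i - 1) % n, j] + s[(i + 1) % n, j] +
                   s[i, (j - 1) % n] + s[i, (j + 1) % n])
            p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * J * nbr))   # p(s_ij = +1 | rest)
            s[i, j] = 1 if rng.random() < p_up else -1

print("magnetization after 100 sweeps:", s.mean())
```
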
21. Nonparametric Bayesian models (sketch below): a. The Dirichlet process; b. Dirichlet process mixtures and infinite mixture models; c. The infinite HMM.
Reading:

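For lecture 21, a sketch of the Chinese restaurant process view of the Dirichlet process: customer n sits at an existing table with probability proportional to its occupancy, or starts a new table with probability proportional to the concentration alpha (the values of alpha and N are made up):

```python
# Chinese restaurant process: table sizes correspond to mixture-component counts.
import numpy as np

rng = np.random.default_rng(0)
alpha, N = 1.0, 100
tables = []                                   # customers seated at each table

for n in range(N):
    probs = np.array(tables + [alpha], dtype=float)
    probs /= probs.sum()
    k = rng.choice(len(probs), p=probs)
    if k == len(tables):
        tables.append(1)                      # open a new table (new component)
    else:
        tables[k] += 1

print("table sizes after", N, "customers:", tables)
```
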
22. Learning undirected graphical models: a. CRFs and general Markov networks; b. Iterative scaling; c. Contrastive divergence algorithms.
Reading:
Homework 4 due

23. Learning in structured input-output spaces: a. Maximum-margin Markov networks.
Reading:

24. Discussion of discriminative learning objectives, a unified view: a. Likelihood-based and maximum-margin learning; b. Convex optimization; c. Learning algorithms.
Reading:

11/30/07: Poster presentation of final project

25. Applications.
Reading:
Submit final report

Final Exam (take home)
All material thus far

Recitation Schedule

Date | Time | Place | Topic
Sep 21 | 5-6 pm | WeH 4623 | Lectures 1-3
Sep 27 | 6-7 pm | NSH 3001 | Lectures 3-4
Oct 4 | 6-7 pm | NSH 3001 | BP, Junction Tree
Oct 11 | 6-7 pm | WeH 4623 | LMS
Oct 18 | 6-7 pm | NSH 3001 | Exp Family, GLIMs
Oct 25 | 6-7 pm | NSH 3001 |
Nov 1 | 6-7 pm | NSH 3001 |
Nov 8 | 6-7 pm | NSH 3001 |