Apr 6 Overview | Zoubin Ghahramani |
Graphical models overview slides.
Factor Graph Propagation slides; whiteboard (jpg image). |
Apr 13 Variational methods | Zoubin Ghahramani |
An Introduction to Variational Methods for Graphical Models. Michael Jordan, Zoubin Ghahramani, Tommi Jaakkola, and Lawrence Saul.
Tutorial on Variational Approximation Methods. Tommi Jaakkola.
J. M. Winn. Variational Message Passing and its Applications. Ph.D. Thesis, Department of Physics, University of Cambridge, 2003. Chapters 2 and 5.
Wiegerinck, W. Variational Approximations between Mean Field Theory and the Junction Tree Algorithm. UAI 2000.
M. J. Wainwright and M. I. Jordan. Graphical Models, Exponential Families, and Variational Inference. UC Berkeley, Dept. of Statistics, Technical Report 649, September 2003. http://www.eecs.berkeley.edu/~wainwrig/Papers/WaiJorVariational03.ps |
Apr 20 Expectation Propagation (note: starts 4:30pm, half an hour later than usual) | Yuan Qi |
Intro to EP: A Family of Algorithms for Approximate Bayesian Inference. Thomas Minka. Ph.D. Thesis, Ch. 4.
Tree EP: Tree-structured Approximations by Expectation Propagation. Thomas Minka and Yuan Qi. NIPS 2003.
EP for dynamic systems: Expectation Propagation for Signal Detection in Flat-fading Channels. Yuan Qi and Thomas Minka. Proceedings of the IEEE International Symposium on Information Theory, June 2003, Yokohama, Japan. |
Apr 27 Structure learning (1) | Ricardo Silva, Anna Goldenberg, Fan Li |
Learning Bayesian Networks, book by Richard Neapolitan.
A Tutorial on Learning with Bayesian Networks. David Heckerman.
Scheines, R. (1997). "An Introduction to Causal Inference". In Causality in Crisis, ed. Steven Turner and Vaughan McKim, University of Notre Dame Press. Available at http://www.phil.cmu.edu/faculty/spirtes/tetradpapers/notredame.ps (a nice overview of representational issues that are very relevant for structure learning).
Learning Bayesian Networks with Local Structure. Nir Friedman and Moises Goldszmidt. |
May 4 Bayesian Error-Bars for Belief Net Inference (note room change: NSH 1507) | Russell Greiner |
Russ' graphical model results.
A Bayesian belief network (BN) models a joint distribution over a set of n variables, using a DAG structure to represent the immediate dependencies between the variables, and a set of parameters (aka "CPtables") to represent the local conditional probabilities of a node given each assignment to its parents. In many situations, these parameters are themselves random variables --- this may reflect the uncertainty of the domain expert, or may come from a training sample used to estimate the parameter values. The distribution over these "CPtable variables" induces a distribution over the response the BN will return to any "What is Pr(Q=q | E=e)?" query. This paper investigates properties of this response: showing first that it is asymptotically normal, then providing, in closed form, its mean and asymptotic variance. We then present an effective general algorithm for computing this variance, which has the same complexity as simply computing (the mean value of) the response itself --- i.e., O(n 2^w), where w is the effective tree width. Finally, we provide empirical evidence that a Beta approximation works much better than the normal distribution, especially for small sample sizes, and that our algorithm works effectively in practice over a range of belief net structures, sample sizes, and queries. This is joint work with Tim Van Allen, Ajit Singh, and Peter Hooper. |
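The setup in the abstract is easy to reproduce empirically. Below is a minimal sketch, not the paper's closed-form O(n 2^w) algorithm: it assumes a hypothetical two-node network A -> B with made-up Dirichlet posterior counts over its CPtables, samples the response to the query Pr(A=1 | B=1), and moment-matches a Beta approximation to the sampled responses.

```python
# A minimal sketch, assuming a hypothetical network A -> B and made-up
# Dirichlet counts; it is NOT the paper's closed-form algorithm.
import numpy as np

rng = np.random.default_rng(0)

# Dirichlet posterior counts (prior + data) for each CPtable row.
alpha_A    = np.array([3.0, 5.0])   # counts for A=0, A=1
alpha_B_a0 = np.array([6.0, 2.0])   # counts for B | A=0
alpha_B_a1 = np.array([1.0, 7.0])   # counts for B | A=1

def sample_response():
    """Draw one set of CPtables from the posterior, answer Pr(A=1 | B=1)."""
    pA  = rng.dirichlet(alpha_A)
    pB0 = rng.dirichlet(alpha_B_a0)
    pB1 = rng.dirichlet(alpha_B_a1)
    joint_a1_b1 = pA[1] * pB1[1]
    return joint_a1_b1 / (pA[0] * pB0[1] + joint_a1_b1)

samples = np.array([sample_response() for _ in range(20000)])
m, v = samples.mean(), samples.var()

# Beta(a, b) with the same mean and variance as the sampled responses:
# m = a/(a+b) and v = m(1-m)/(a+b+1) give a+b = m(1-m)/v - 1.
s = m * (1.0 - m) / v - 1.0
print(f"mean={m:.4f}  var={v:.6f}  Beta({m*s:.2f}, {(1-m)*s:.2f})")
```

With small counts like these the sampled response distribution is visibly skewed, which is the regime where the abstract reports the Beta approximation beating the normal one.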
May 11 A Structural Extension to Logistic Regression: Discriminative Parameter Learning of Belief Net Classifiers |
Russell Greiner |
Bayesian belief nets (BNs) are often used for classification tasks --- typically to return the most likely class label for each specified instance. Many BN-learners, however, attempt to find the BN that maximizes a different objective function --- viz., likelihood rather than classification accuracy --- typically by first learning an appropriate graphical structure, then finding the maximal likelihood parameters for that structure. As these parameters may not maximize the classification accuracy, "discriminative learners" follow the alternative approach of seeking the parameters that maximize *conditional likelihood* (CL) over the distribution of instances the BN will have to classify. This presentation first formally specifies this task and shows how it extends standard logistic regression. After analyzing its inherent sample and computational complexity, we present a general algorithm for this task, ELR, that applies to arbitrary BN structures and works effectively even when given incomplete training data. We present empirical evidence that ELR produces better classifiers than the standard "generative" algorithms in a variety of situations, especially in common situations where the given BN structure is incorrect. This is joint work with Wei Zhou, Xiaoyuan Su, and Bin Shen. See http://www.cs.ualberta.ca/~greiner/ELR. |
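The logistic-regression connection the abstract mentions is concrete in the simplest structural case: for a naive Bayes structure with complete data, maximizing conditional likelihood reduces to logistic regression. A minimal sketch of that special case, using synthetic data and plain gradient ascent (not the authors' ELR implementation):

```python
# Gradient ascent on the conditional log-likelihood sum_i log Pr(y_i | x_i)
# of a logistic model -- the structural special case of discriminative
# BN parameter learning. Data below are synthetic.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = ((X @ np.array([1.5, -2.0, 0.5]) + rng.normal(size=200)) > 0).astype(float)

w = np.zeros(3)
b = 0.0
lr = 0.5
for _ in range(1000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # Pr(y=1 | x) under current params
    w += lr * X.T @ (y - p) / len(y)         # gradient of mean cond. log-lik.
    b += lr * (y - p).mean()

p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
print("training accuracy:", ((p > 0.5) == (y == 1)).mean())
```

Per the abstract, ELR carries this same conditional-likelihood objective over to arbitrary BN structures and to incomplete training data, where the reduction to plain logistic regression no longer applies.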
May 18 Structure learning (2) | Ricardo Silva, Anna Goldenberg, Fan Li |
Scheines, R. (1997). "An Introduction to Causal Inference". In Causality in Crisis, ed. Steven Turner and Vaughan McKim, University of Notre Dame Press. Available at http://www.phil.cmu.edu/faculty/spirtes/tetradpapers/notredame.ps (a nice overview of representational issues that are very relevant for structure learning) (again).
Friedman, N. (1997). Learning Belief Networks in the Presence of Missing Values and Hidden Variables. Fourteenth International Conference on Machine Learning (ICML), 1997.
Friedman, N. and Koller, D. (2003). Being Bayesian about Network Structure: A Bayesian Approach to Structure Discovery in Bayesian Networks. Machine Learning, 50:95-126, 2003.
Elidan, G., Lotner, N., Friedman, N., and Koller, D. (2000). Discovering Hidden Variables: A Structure-Based Approach. Proceedings of the Neural Information Processing Systems conference (NIPS), 2000.
Silva, R., Scheines, R., Glymour, C., and Spirtes, P. (2003). "Learning Measurement Models for Unobserved Variables". Proceedings of the 19th Conference on Uncertainty in Artificial Intelligence.
Friedman, N., Pe'er, D., and Nachman, I. (1999). Learning Bayesian Network Structure from Massive Datasets: The "Sparse Candidate" Algorithm. UAI 15, 1999. http://www.cs.huji.ac.il/~nir/Abstracts/FPN1.html
Moore, A. and Wong, W.-K. (2003). Optimal Reinsertion: A New Search Operator for Accelerated and More Accurate Bayesian Network Structure Learning. ICML 2003.
Goldenberg, A. and Moore, A. (2004). Tractable Learning of Large Bayes Net Structures from Sparse Data. ICML 2004. |
Causality (do we want to cover this?) | These readings cover the bulk of causality, i.e., estimating causal effects when the structure is given in advance (a toy adjustment example follows the reading list below). |
David Edwards (2000): "Causal
Inference", this is Chapter 8 (pp. 219-243) of his book "Introduction to
Graphical Modelling" (Springer, 2nd ed). Phil Dawid (2000): Causal inference without counterfactuals. J. Amer. Statist. Ass. 95 (2000), 407-448. An earlier version is available for download at http://www.homepages.ucl.ac.uk/~ucak06d/reports.html number 188 (year 1997) J. Pearl, "Statistics and Causal Inference: A Review" In Test Journal, Vol. 12(2), pp. 281-345, December 2003 (with discussions). Available at ftp://ftp.cs.ucla.edu/pub/stat_ser/Test_pea-final.pdf J. Pearl, ``Simpson's paradox: An anatomy'' Extracted from Chapter 6 of CAUSALITY. Available at http://bayes.cs.ucla.edu/R264.pdf |
Possible future topics: parameter learning, active learning, dynamic Bayes nets, probabilistic relational models, other types of graphs, feature selection for maxent models (suggestions welcome). |
Kernel Conditional Random Fields | Jerry Zhu |
Kernel Conditional Random Fields: Representation, Clique Selection, and Semi-Supervised Learning. John Lafferty, Yan Liu, Xiaojin Zhu. CMU tech report CMU-CS-04-115. |
Monte Carlo |
Introduction to Monte Carlo Methods. David MacKay.
Probabilistic Inference Using Markov Chain Monte Carlo Methods. Radford Neal. |
Zoubin's 2003 unsupervised learning course website http://www.gatsby.ucl.ac.uk/~zoubin/course03/index.html
Lise's GM reading group http://glue.umd.edu/~acardena/graphmod/
Rutgers GM course: http://www.cs.rutgers.edu/~vladimir/class/cs500gm.html
Learning in Graphical Models (ed. M. Jordan): http://www.dai.ed.ac.uk/homes/felixa/jordancol.html
Kevin Murphy's reading list: http://www.ai.mit.edu/~murphyk/Bayes/bnintro.html
Luo Si (lsi)
Jian Zhang (jian.zhang)
Jerry Zhu (zhuxj)