Abstract
Graphical models such as Bayesian networks and structural equation
models are widely used as representations for causal relations. A
fundamental issue when expressing causality in graphical models is how
to deal with possible hidden common causes. We introduce and evaluate
a new principled way of discovering latent causes of a set of observed
variables under the rather common assumption that all observed
variables are continuous linear measures, but never causes, of the
underlying latents. In other words, the observed variables define a
measurement model of the latents and the learning task can be seen as
clustering variables according to their common causes. By carefully
choosing which identifiability constraints to assume, the described
methodology has the theoretical strength of not requiring knowledge
about the number of latents, the distribution of any variable,
whether the latents are independent, or even whether the latents are
linearly related among themselves. An empirical study with simulated
data is performed to identify situations in which this approach is
robust to the effects of sampling variability.
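
As a rough illustration of the setting described above (not the discovery method itself), the following sketch simulates a linear measurement model in which each observed variable is a noisy linear measure of exactly one latent. The number of latents, the loadings, the noise level, the Gaussian distributions and the linear relation between the two latents are all assumptions made only for this example; the approach in the abstract does not require any of them to be known. The sketch shows only that indicators sharing a latent cause tend to correlate more strongly with each other than with indicators of a different latent.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # sample size (assumed for illustration)

# Two hidden common causes; the second depends linearly on the first.
# The linear relation and Gaussian noise are assumptions of this sketch only.
latent1 = rng.normal(size=n)
latent2 = 0.5 * latent1 + rng.normal(size=n)


def measure(latent, loading=1.0, noise_sd=0.5):
    """Return a continuous linear measure of a latent plus independent noise."""
    return loading * latent + rng.normal(scale=noise_sd, size=latent.shape[0])


# Columns 0-2 measure latent1, columns 3-5 measure latent2.
# Observed variables are effects of the latents and never causes of them.
observed = np.column_stack(
    [measure(latent1) for _ in range(3)] + [measure(latent2) for _ in range(3)]
)

corr = np.corrcoef(observed, rowvar=False)

cluster_a, cluster_b = [0, 1, 2], [3, 4, 5]
within = [corr[i, j] for c in (cluster_a, cluster_b) for i in c for j in c if i < j]
across = [corr[i, j] for i in cluster_a for j in cluster_b]

print("smallest within-cluster correlation:", round(min(within), 3))
print("largest cross-cluster correlation:  ", round(max(across), 3))
```

In this simulated setting the true clustering is known by construction, so it can serve as a reference point when checking how a learned clustering degrades with smaller samples.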
Joint work with Richard Scheines, Peter Spirtes and Clark Glymour.