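Let us now return to the problem of computing the posterior probabilities in the QMR model. Recall that it is the conditional probabilities corresponding to the positive findings that need to be simplified. To this end, we write
$$ P(f_i = 1 \mid d) = 1 - e^{-\theta_{i0} - \sum_j \theta_{ij} d_j} = e^{\log(1 - e^{-x})}, $$
where $x = \theta_{i0} + \sum_j \theta_{ij} d_j$. Consider the exponent
$f(x) = \log(1 - e^{-x})$. For noisy-OR, as well as for many other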
conditional models involving compact representations (e.g., logistic
regression), the exponent f(x) is a concave function of x. Based on
the discussion in the previous section, we know that there must exist
a variational upper bound for this function that is linear in x:
$$ f(x) \le \lambda x - f^*(\lambda). $$
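Using Eq. (9) to evaluate the conjugate function $f^*(\lambda)$
for noisy-OR, we obtain:
$$ f^*(\lambda) = -\lambda \log \lambda + (\lambda + 1) \log(\lambda + 1). $$
The desired bound is obtained by substituting into
Eq. (13) (and recalling the definition
$x = \theta_{i0} + \sum_j \theta_{ij} d_j$):
$$ P(f_i = 1 \mid d) = e^{f(\theta_{i0} + \sum_j \theta_{ij} d_j)} \le e^{\lambda_i (\theta_{i0} + \sum_j \theta_{ij} d_j) - f^*(\lambda_i)} \equiv P(f_i = 1 \mid d, \lambda_i). $$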
Note that the ``variational evidence'' $P(f_i = 1 \mid d, \lambda_i)$ is the
exponential of a term that is linear in the disease vector d.
Just as with the negative findings, this implies that the
variational evidence can be incorporated into the posterior in
time linear in the number of diseases associated with the finding.
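The upper bound can be checked numerically. The following is a minimal sketch, assuming the standard noisy-OR exponent $f(x) = \log(1 - e^{-x})$ and its conjugate $f^*(\lambda) = -\lambda \log \lambda + (\lambda + 1) \log(\lambda + 1)$; the parameter values and the names `theta0`, `theta`, `lam`, and `upper_bound` are hypothetical and chosen only for illustration.

```python
import numpy as np

def f_conj(lam):
    """Conjugate f*(lam) = -lam*log(lam) + (lam + 1)*log(lam + 1), for lam > 0."""
    return -lam * np.log(lam) + (lam + 1.0) * np.log(lam + 1.0)

def upper_bound(d, theta0, theta, lam):
    """Variational upper bound exp(lam * x - f*(lam)) on P(f_i = 1 | d)."""
    x = theta0 + theta @ d
    return np.exp(lam * x - f_conj(lam))

# Hypothetical parameters for one finding with three parent diseases.
theta0 = 0.05
theta = np.array([0.8, 0.3, 1.2])
d = np.array([1, 0, 1])

x = theta0 + theta @ d
exact = 1.0 - np.exp(-x)      # noisy-OR probability of a positive finding
lam_opt = 1.0 / np.expm1(x)   # lam = f'(x); the bound touches f at this x

for lam in (0.05, lam_opt, 1.0):
    print(lam, upper_bound(d, theta0, theta, lam), ">=", exact)
```

For any positive `lam` the printed bound should not fall below the exact probability, and at `lam_opt` the two should agree up to rounding, since the linear bound is tangent to f there.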
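There is also a graphical way to understand the effect of the transformation. We rewrite the variational evidence as follows:
$$ P(f_i = 1 \mid d, \lambda_i) = e^{\lambda_i \theta_{i0} - f^*(\lambda_i)} \prod_j \left[ e^{\lambda_i \theta_{ij}} \right]^{d_j}. $$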
Note that the first term is a constant, and note moreover that the
product is factorized across the diseases. Each of the latter factors
can be multiplied with the pre-existing prior on the corresponding
disease (possibly itself modulated by factors from the negative evidence).
The constant term can be viewed as associated with a delinked finding
node $f_i$. Indeed, the effect of the variational transformation is
to delink the finding node $f_i$
from the graph, altering the priors
of the disease nodes that are connected to that finding node. This
graphical perspective will be important for the presentation of our
variational algorithm--we will be able to view variational transformations
as simplifying the graph until a point at which exact methods can be
run.
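The delinking step can be sketched in code. This is an illustrative example only, assuming independent binary disease priors and a single positive finding; the function `delink` and all parameter values are hypothetical. Each disease prior is multiplied by the factor $e^{\lambda_i \theta_{ij}}$ when $d_j = 1$ and renormalized, so the resulting bound on the marginal probability of the finding is a product over diseases rather than a sum over all disease configurations.

```python
import itertools
import numpy as np

def f_conj(lam):
    """Conjugate of the noisy-OR exponent, for lam > 0."""
    return -lam * np.log(lam) + (lam + 1.0) * np.log(lam + 1.0)

def delink(prior, theta0, theta, lam):
    """Absorb one positive finding into the disease priors.

    Returns the constant attached to the delinked finding node, the
    per-disease normalizers, and the modified priors P'(d_j = 1).
    """
    const = np.exp(lam * theta0 - f_conj(lam))
    weight = np.exp(lam * theta)              # factor picked up when d_j = 1
    norm = (1.0 - prior) + prior * weight     # sum over d_j in {0, 1}
    return const, norm, prior * weight / norm

# Hypothetical three-disease example.
prior = np.array([0.1, 0.2, 0.05])
theta0 = 0.05
theta = np.array([0.8, 0.3, 1.2])
lam = 0.5

const, norm, new_prior = delink(prior, theta0, theta, lam)
bound = const * np.prod(norm)   # upper bound on P(f_i = 1), linear in #diseases

# Brute-force reference: sum over all 2^n disease configurations.
exact = 0.0
for bits in itertools.product([0, 1], repeat=len(prior)):
    d = np.array(bits)
    p_d = np.prod(np.where(d == 1, prior, 1.0 - prior))
    exact += p_d * (1.0 - np.exp(-theta0 - theta @ d))

print(exact, "<=", bound)
print("modified priors:", new_prior)
```

The value of `lam` here is fixed arbitrarily, so the bound is loose; it is meant to show the factorization, not a tuned approximation.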
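We now turn to the lower bounds on the conditional probabilities
$P(f_i = 1 \mid d)$. The exponent
$f(\theta_{i0} + \sum_j \theta_{ij} d_j)$ in
the exponential representation is of the form to which we applied
Jensen's inequality in the previous section. Indeed, since f is
concave we need only identify the non-negative variables $z_j$,
which in this case are $\theta_{ij} d_j$, and the constant a,
which is now $\theta_{i0}$. Applying the bound in
Eq. (12) we have:
$$ \begin{aligned}
P(f_i = 1 \mid d) &= e^{f(\theta_{i0} + \sum_j \theta_{ij} d_j)} \\
&\ge e^{\sum_j q_{j|i} f(\theta_{i0} + \theta_{ij} d_j / q_{j|i})} \\
&= e^{\sum_j q_{j|i} \left[ d_j f(\theta_{i0} + \theta_{ij} / q_{j|i}) + (1 - d_j) f(\theta_{i0}) \right]} \equiv P(f_i = 1 \mid d, q_{\cdot|i}),
\end{aligned} $$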
where we have allowed a different variational distribution $q_{\cdot|i}$
for each finding. Note that once again the exponent of the bound is
linear in d. As in the case of the upper bound, this
implies that the variational evidence can be incorporated into
the posterior distribution in time linear in the number of diseases.
Moreover, we can once again view the variational transformation
in terms of delinking the finding node $f_i$
from the graph.
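The lower bound can be checked in the same way. The sketch below assumes the Jensen form $\sum_j q_{j|i} \left[ d_j f(\theta_{i0} + \theta_{ij}/q_{j|i}) + (1 - d_j) f(\theta_{i0}) \right]$ for the exponent, with binary $d_j$, a strictly positive $\theta_{i0}$, and strictly positive $q_{j|i}$ summing to one; the parameter values and the name `lower_bound` are hypothetical.

```python
import numpy as np

def f(x):
    """Noisy-OR exponent f(x) = log(1 - exp(-x)), for x > 0."""
    return np.log1p(-np.exp(-x))

def lower_bound(d, theta0, theta, q):
    """Jensen lower bound on P(f_i = 1 | d); the exponent is linear in d."""
    on = f(theta0 + theta / q)   # contribution of disease j when d_j = 1
    off = f(theta0)              # contribution of disease j when d_j = 0
    return np.exp(np.sum(q * (d * on + (1 - d) * off)))

# Hypothetical parameters for one finding with three parent diseases.
theta0 = 0.05
theta = np.array([0.8, 0.3, 1.2])
d = np.array([1, 0, 1])
exact = 1.0 - np.exp(-theta0 - theta @ d)

# Two arbitrary variational distributions over the parent diseases.
for q in (np.full(3, 1.0 / 3.0), np.array([0.4, 0.1, 0.5])):
    print(q, lower_bound(d, theta0, theta, q), "<=", exact)
```

Neither choice of `q` is optimized; placing more mass on the active diseases typically tightens the bound, which is the role of the variational distribution.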