Here we look at two theories of decision and inference that are closely related to Quasi-Bayesian theory: lower expectations and lower previsions. The main message is that, as far as mathematical structure is concerned, the three of them refer to the same thing.
Suppose you have a loss function $l$.
Now you pick a probability distribution $p$
and you calculate the
expected loss using that probability distribution:
\[ E_p[l] = \sum_x l(x) \, p(x) . \]
Suppose you repeat this for all probability distributions in a set of
distributions. Then you will have a value of $E_p[l]$ for each
probability distribution $p$ in the set.
Suppose the set of distributions is convex.
A little bit of thinking will convince you that in this case you will
produce an interval of values of $E_p[l]$. Why? Because if two distributions
$p_1$ and $p_2$ are in the credal set, the ``distributions within'' them
(the convex combinations $\alpha p_1 + (1-\alpha) p_2$ for $\alpha \in [0,1]$)
are also in the set (the set is convex); and since $E_p[l]$ is linear in $p$,
these combinations sweep out every value between $E_{p_1}[l]$ and
$E_{p_2}[l]$, so an interval of expectations is created.
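A quick numeric sketch of this argument (all numbers below are made up for illustration): mixing two distributions moves the expected loss continuously between the two endpoint values, so the set of achievable expectations is an interval.

```python
# Two distributions over three outcomes, and a loss value for each outcome.
p1 = [0.2, 0.3, 0.5]
p2 = [0.6, 0.3, 0.1]
loss = [10.0, 0.0, 4.0]

def expectation(p, l):
    """Expected loss E_p[l]: sum over outcomes of p(x) * l(x)."""
    return sum(px * lx for px, lx in zip(p, l))

def mix(p, q, alpha):
    """Convex combination alpha*p + (1 - alpha)*q: a 'distribution within' p and q."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(p, q)]

# Expectations of the mixtures sweep the interval between E_{p2}[l] and E_{p1}[l].
values = [expectation(mix(p1, p2, a / 100), loss) for a in range(101)]
print(min(values), max(values))  # endpoints of the interval of expected losses
```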
So for every loss function $l$,
a convex set of distributions $K$
determines an interval of expected losses. The minimum value of
expected loss is called the lower expectation and the
maximum value of expected loss is called the upper expectation:
\[ \underline{E}[l] = \min_{p \in K} E_p[l] , \qquad
   \overline{E}[l] = \max_{p \in K} E_p[l] . \]
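Because $E_p[l]$ is linear in $p$, when the credal set is the convex hull of finitely many extreme points the minimum and maximum are attained at those points, so both bounds can be computed by enumeration. A minimal sketch, with a made-up credal set:

```python
def expectation(p, l):
    """Expected loss E_p[l]."""
    return sum(px * lx for px, lx in zip(p, l))

def lower_expectation(vertices, l):
    """Minimum expected loss over the convex hull of the given extreme points."""
    return min(expectation(p, l) for p in vertices)

def upper_expectation(vertices, l):
    """Maximum expected loss over the convex hull of the given extreme points."""
    return max(expectation(p, l) for p in vertices)

# A credal set over three outcomes, listed by its extreme points (made up).
credal = [[0.5, 0.3, 0.2], [0.2, 0.5, 0.3], [0.3, 0.2, 0.5]]
loss = [1.0, 0.0, 2.0]
print(lower_expectation(credal, loss), upper_expectation(credal, loss))
```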
In fact you do not need to store
both $\underline{E}[\cdot]$
and $\overline{E}[\cdot]$.
We have the following fundamental relation:
\[ \overline{E}[l] = - \underline{E}[-l] . \]
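The relation at stake is conjugacy: $\overline{E}[l] = -\underline{E}[-l]$, so storing the lower expectation alone suffices. A quick numeric check on a made-up two-outcome credal set:

```python
def expectation(p, l):
    return sum(px * lx for px, lx in zip(p, l))

# Extreme points of a made-up credal set over two outcomes.
credal = [[0.3, 0.7], [0.6, 0.4]]
loss = [5.0, 1.0]

lower = min(expectation(p, loss) for p in credal)
upper = max(expectation(p, loss) for p in credal)

# Conjugacy: the upper expectation of l equals minus the lower
# expectation of -l, so the lower expectation determines both bounds.
neg_loss = [-x for x in loss]
assert abs(upper - (-min(expectation(p, neg_loss) for p in credal))) < 1e-12
print(lower, upper)
```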
Based on the definitions of lower and upper expectations, the following question arises:
Suppose you start with a system of intervals for expected loss. For every loss function you can think of, determine an interval on the real line, and say that interval is the interval of expected losses. You don't have any particular model of probabilities at this point, just the expected-loss intervals. You have created a lower expectation model.
Note: before, we started from a convex set of probability distributions and we calculated a system of intervals. Now we consider the reverse process.
You want to use the lower expectation model to create a set of
probability distributions. You can do that by defining the set:
\[ K = \{\, p : E_p[l] \geq \underline{E}[l]
        \text{ for every loss function } l \,\} . \]
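A sketch of this construction for a lower expectation model specified on finitely many loss functions (the bounds below are made-up numbers): a distribution belongs to $K$ exactly when its expected loss dominates every lower bound.

```python
def expectation(p, l):
    return sum(px * lx for px, lx in zip(p, l))

# A lower expectation model: a lower bound on the expected value of each
# of a few loss functions (made-up numbers, three outcomes).
bounds = [
    ([1.0, 0.0, 0.0], 0.2),  # E_p[l] >= 0.2, i.e. p(x1) >= 0.2
    ([0.0, 1.0, 0.0], 0.1),  # p(x2) >= 0.1
    ([0.0, 0.0, 1.0], 0.3),  # p(x3) >= 0.3
]

def in_K(p, bounds, tol=1e-12):
    """Is p in K, i.e. does E_p[l] dominate the lower bound for every l?"""
    return all(expectation(p, l) >= low - tol for l, low in bounds)

print(in_K([0.4, 0.2, 0.4], bounds))    # True: satisfies every lower bound
print(in_K([0.7, 0.25, 0.05], bounds))  # False: violates p(x3) >= 0.3
```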
Now the following question becomes relevant:
If I create a lower expectation $\underline{E}[\cdot]$ and then I obtain $K$ as in the previous expression, is it possible to recover exactly the original lower expectation by creating intervals with $K$? Are the lower expectation and the set $K$ representing the same thing?
The answer is this: yes, exactly when the lower expectation is superadditive,
\[ \underline{E}[l_1 + l_2] \geq \underline{E}[l_1] + \underline{E}[l_2] , \]
and positively affine,
\[ \underline{E}[\alpha l + \beta] = \alpha \, \underline{E}[l] + \beta
   \quad \text{for } \alpha \geq 0 . \]
(An operator $F$ is superadditive if $F[l_1 + l_2] \geq F[l_1] + F[l_2]$. Notice that $\min$ and $\inf$ are superadditive.)
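Superadditivity holds automatically for any lower expectation generated by a set of distributions: the distribution that minimizes $E_p[l_1 + l_2]$ is also available, separately, for $l_1$ and for $l_2$, and each separate minimum can only go lower. A numeric sketch with made-up numbers:

```python
def expectation(p, l):
    return sum(px * lx for px, lx in zip(p, l))

def lower(vertices, l):
    return min(expectation(p, l) for p in vertices)

# Extreme points of a made-up credal set over three outcomes.
credal = [[0.5, 0.5, 0.0], [0.1, 0.4, 0.5], [0.3, 0.3, 0.4]]
l1 = [2.0, -1.0, 0.0]
l2 = [0.0, 3.0, -2.0]
l_sum = [a + b for a, b in zip(l1, l2)]

# Superadditivity: the minimizer for l1 + l2 is feasible for l1 and l2
# separately, so lower(l1 + l2) >= lower(l1) + lower(l2).
assert lower(credal, l_sum) >= lower(credal, l1) + lower(credal, l2)
print(lower(credal, l1), lower(credal, l2), lower(credal, l_sum))
```

Here the inequality is strict: the minima for $l_1$ and $l_2$ are attained at different extreme points, so their sum undershoots the minimum for $l_1 + l_2$.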
Note this other fundamental fact: if you have a set of probability distributions that creates a superadditive, positively affine lower expectation, the convex hull of this set of distributions will create the same lower expectation. Convex sets of probability distributions are convenient because they summarize all the information: you cannot generate a non-convex set $K$ by the process above.
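The convex-hull claim can be checked numerically as well: enlarging a set of distributions with mixtures of its members never changes the lower expectation, because every mixture's expected loss lies between the expected losses of its components. A small sketch:

```python
def expectation(p, l):
    return sum(px * lx for px, lx in zip(p, l))

# Two distributions over two outcomes, plus sampled mixtures of them.
points = [[0.7, 0.3], [0.2, 0.8]]
mixtures = [
    [a / 10 * points[0][i] + (1 - a / 10) * points[1][i] for i in range(2)]
    for a in range(11)
]
loss = [4.0, 1.0]

# The enlarged set yields the same minimum expected loss as the
# original two points, since mixtures cannot undershoot both endpoints.
low_points = min(expectation(p, loss) for p in points)
low_hull = min(expectation(p, loss) for p in points + mixtures)
assert abs(low_points - low_hull) < 1e-12
print(low_points, low_hull)
```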
The relationship between credal sets and superadditive, positively affine lower expectations is simple: they are different representations of the same thing.
For lower expectation models that are not superadditive and positively affine, the situation is more complicated. Suppose you pick one of those lower expectations, $\underline{E}[\cdot]$, and you generate the set $K$ of all distributions that dominate $\underline{E}[\cdot]$ (for all functions $l$). Now if you try to generate a lower expectation from $K$, it will not necessarily be equal to $\underline{E}[\cdot]$. But the lower expectation generated by $K$ will be superadditive and positively affine. And from that point on, there will be an exact correspondence between the credal set and the lower expectation generated from it [16].
There is another name for superadditive, positively affine lower expectations, advocated by Walley [30]: lower previsions. The name prevision has some philosophical connotations because it emphasizes that an expected loss is a subjective ``guess'' about the future. But for all practical purposes lower previsions are exactly equal to superadditive, positively affine lower expectations.