Most forget their parameterized models are ad hoc, and so fret about how to put uncertainty on the parameters. Hence the cheerful reception of De Finetti’s representation theorem.
Video
Links: YouTube * Twitter – X * Rumble * Bitchute * Class Page * Jaynes Book * Uncertainty
HOMEWORK: Given below; see end of lecture.
Lecture
This is an excerpt from Chapter 8 of Uncertainty. In which are four—count ’em!—four typos placed by my enemies, unlike the last two weeks only having one each. This time I fixed them here, but they will still exist in the book.
Suppose (for simplicity) Y is true-false and is to be assessed more than once. Given E, the usual way to say that Y is exchangeable is if
$$\Pr(\mbox{Y}_1\mbox{Y}_2\dots\mbox{Y}_n|\mbox{E}) = \Pr(\mbox{Y}_{\pi_1}\mbox{Y}_{\pi_2}\dots\mbox{Y}_{\pi_n}|\mbox{E})
$$
where $(\pi_1,\dots,\pi_n)$ is a permutation of the numbers $1,\dots,n$. The order doesn’t matter. Exchangeability is not quite the same as irrelevance (or independence), which might not be obvious given this way of writing. To see why, consider a Polya urn model.
We have an urn with $n$ white and $m$ black balls. We grab one out, note its color, and then toss it and another ball of the same color back into the urn. We then grab out a second ball, and repeat. Given this evidence, the probability of grabbing a first white ball is $n/(n+m)$. Suppose the first ball grabbed is white. Another white ball is tossed into the urn, giving now $n+1$ white balls. Given this updated information, the probability the second ball drawn is white is $(n+1)/(n+m+1)$. But suppose instead a black ball had been drawn first. Then the probability the second ball is white is $n/(n+m+1)$. Knowledge of which ball was drawn first is relevant to knowing the probability the second ball is white. In other words, the notation above might be incomplete. We really have (with obvious notation)
$$
\Pr(\mbox{first white} | n,m) = \frac{n}{n+m},
$$
then
$$
\Pr(\mbox{second white} | n+1,m) = \frac{n+1}{n+m+1},
$$
or
$$
\Pr(\mbox{second white} | n,m+1) = \frac{n}{n+m+1}.
$$
Now, given our evidence, the probability of the first ball being white is, no matter what, $n/(n+m)$. And the probability the second is white is then (simplifying)
$$\Pr(\mbox{second white} | \mbox{E} ) = \Pr(w_1w_2 \mbox{ or } b_1w_2 | \mbox{E} ) =$$
$$\frac{n}{n+m}\frac{n+1}{n+m+1} + \frac{m}{n+m}\frac{n}{n+m+1} =$$
$$\frac{n(n+1) + mn}{(n+m+1)(n+m)} = \frac{n}{n+m}.$$
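This arithmetic is easy to verify exactly. Here is a small sketch (not from the book; the function name and starting counts are mine) that computes the four two-draw sequence probabilities with exact fractions and confirms that the marginal probability of white on the second draw equals $n/(n+m)$:

```python
from fractions import Fraction

def polya_two_draws(n, m):
    """Exact probabilities of the four two-draw sequences for a Polya
    urn starting with n white and m black balls."""
    N = n + m
    p_w1 = Fraction(n, N)   # Pr(first white | n, m)
    p_b1 = Fraction(m, N)   # Pr(first black | n, m)
    return {
        "ww": p_w1 * Fraction(n + 1, N + 1),
        "wb": p_w1 * Fraction(m, N + 1),
        "bw": p_b1 * Fraction(n, N + 1),
        "bb": p_b1 * Fraction(m + 1, N + 1),
    }

# Example: 3 white, 2 black.
probs = polya_two_draws(3, 2)

# Marginal probability the second ball is white: ww or bw.
second_white = probs["ww"] + probs["bw"]
print(second_white)  # 3/5, the same as Pr(first white) = n/(n+m)
```

The `Fraction` arithmetic keeps everything exact, so the telescoping cancellation in the equation above is confirmed rather than approximated.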
The probability of white on the first is the same as on the second, as it is on the third or fourth, etcetera. Intuitively this makes sense, because we’re augmenting the original urn with more-or-less the same proportion of new white and black balls. But this result is a consequence of the evidence, not the definition of exchangeability. Exchangeability would be if (in this case)
$$
\Pr(w_1b_2 |\mbox{E}) = \Pr(b_1w_2 |\mbox{E}),
$$
which is easily seen to be true. Of course, we need not check the cases $w_1w_2$ and $b_1b_2$. And it is easily shown that no matter how long the (finite) string is, the sequence is exchangeable, given this evidence. To emphasize, exchangeability is just as conditional as probability and relevance are.
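The claim that any finite draw sequence is exchangeable can also be checked by brute force. The sketch below (my own illustration; the helper name is mine) computes the probability of an arbitrary draw sequence and confirms that every permutation of a given sequence has the same probability:

```python
from fractions import Fraction
from itertools import permutations

def seq_prob(seq, n, m):
    """Probability of a particular draw sequence (tuple of 'w'/'b')
    from a Polya urn starting with n white and m black balls."""
    p = Fraction(1)
    for ball in seq:
        total = n + m
        if ball == "w":
            p *= Fraction(n, total)
            n += 1  # drawn ball plus a matching ball go back in
        else:
            p *= Fraction(m, total)
            m += 1
    return p

# All orderings of two whites and one black, urn starting at (3, 2).
base = ("w", "w", "b")
probs = {perm: seq_prob(perm, 3, 2) for perm in set(permutations(base))}
print(set(probs.values()))  # a single value: order does not matter
```

Only the counts of white and black draws matter, not their order, which is exactly the exchangeability condition.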
We have earlier seen that exchangeability is important in understanding how probabilities are assigned or deduced (Chapter 5) for propositions with finite content. And we saw above how parameters can arise by taking finite information to the limit in a well defined manner. We can get to parameters another way, using exchangeability and De Finetti’s representation theorem.
Suppose evidence E is such that $\Pr(\mbox{Y} | \mbox{E} ) = p$, where $p$ is non-extreme. Then if Y is exchangeable and part of an infinite sequence, it can be shown that
$$\Pr(\mbox{Y}_1\mbox{Y}_2\dots\mbox{Y}_n|\mbox{E}) =\int_0^1 \prod_{i=1}^n p^{\mbox{Y}_i}(1-p)^{1-\mbox{Y}_i}\, dQ(p).
$$
This is an existence proof, not a constructive one as above. All it says is that some distribution $dQ(p)$ exists, we know not what. Bayesians have formed the habit of parameterizing this “prior” distribution on $p$ in an ad hoc way, which is natural since there is no guidance in [this equation]. A typical choice here is to set $dQ(p)=dp$, i.e. a “flat” prior, which gives the same results as above. In a way.
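For the flat prior $dQ(p)=dp$, the integral has the closed form $\int_0^1 p^k(1-p)^{n-k}\,dp = k!(n-k)!/(n+1)!$ for a sequence of $n$ Y’s with $k$ successes. A quick sketch (my own check, not from the book; I use the fact, which holds here, that the flat prior reproduces a Polya urn started with one white and one black ball):

```python
from fractions import Fraction
from math import comb

def flat_prior_seq_prob(k, n):
    """Exact value of the integral of p^k (1-p)^(n-k) dp over [0,1]
    for n exchangeable true-false Y's with k successes, under dQ(p)=dp.
    Equals 1/((n+1)*C(n,k)), the Beta(k+1, n-k+1) normalizer."""
    return Fraction(1, (n + 1) * comb(n, k))

def polya_seq_prob(seq):
    """Probability of a draw sequence from a Polya urn starting with
    one white and one black ball."""
    n, m, p = 1, 1, Fraction(1)
    for ball in seq:
        if ball == "w":
            p *= Fraction(n, n + m)
            n += 1
        else:
            p *= Fraction(m, n + m)
            m += 1
    return p

print(flat_prior_seq_prob(2, 3))        # 1/12
print(polya_seq_prob(("w", "w", "b")))  # also 1/12
```

The two calculations agree: the flat prior gives the same sequence probabilities as that particular urn, which is the sense in which it gives “the same results as above.”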
It’s only “in a way” because the theorem isn’t needed if, as we claim, $\Pr(\mbox{Y} | \mbox{E} ) = p$, because, of course, we know $p$ and we can prove exchangeability. Everything we need to know we have deduced from E, as we did above. We only move to infinity, as an approximation, if it helps finite calculations. Bayesians use the theorem when they assume $p$ is unknown but, as an additional assumption, that it is a fixed number.
Subscribe or donate to support this site and its wholly independent host using credit card click here. Or use the paid subscription at Substack. Cash App: \$WilliamMBriggs. For Zelle, use my email: matt@wmbriggs.com, and please include yours so I know who to thank. BUY ME A COFFEE.