We’re taking a small digression to answer a question put by Deborah Mayo in Part I, pointing to this article on her site. Mayo should be on everybody’s reading list because she has good critiques of orthodox Bayesian statistics (which I don’t follow; we’re logical probabilists here), and because many named persons in statistics comment on her articles. The material below is worth struggling through to see the kinds of arguments which exist over foundations.
Loosely quoting Mayo, a hypothesis (proposition) h is confirmed by x (another proposition) if $latex \Pr(h|xd) > \Pr(h|d)$, where d is any other proposition (this will make sense in the example to come). The proposition is disconfirmed if $latex \Pr(h|xd) < \Pr(h|d)$. If $latex \Pr(h|xd) = \Pr(h|d)$ then x is irrelevant to h. Lastly, h' means "h is false," "not h," or the complement of h. Mayo (I change her notation ever-so-slightly) says "a hypothesis h can be confirmed by x, while h' disconfirmed by x, and yet $latex \Pr(h|xd) < \Pr(h'|xd)$. In other words, we can have $latex \Pr(h|xd) > \Pr(h|d)$ and $latex \Pr(h'|xd) < \Pr(h'|d)$ and yet $latex \Pr(h|xd) < \Pr(h'|xd)$." In support of this contention, she gives an example due to Popper (again changing the notation) about dice throws. First let d = "a six-sided object which will be tossed and only one side can show and with sides labeled 1, 2, ...", i.e. the standard evidence we have about dice.
Consider the next toss with a homogeneous die.
h: 6 will turn up
h’: 6 will not turn up
x: an even number will turn up.
The probability of h is raised by information x, while h' is undermined by x. (Its probability goes from 5/6 to 4/6.) If we identify probability with degree of confirmation, x confirms h and disconfirms h' (i.e., $latex \Pr(h|xd) > \Pr(h|d)$ and $latex \Pr(h'|xd) < \Pr(h'|d)$). Yet because $latex \Pr(h|xd) < \Pr(h'|xd)$, h is less well confirmed given x than is h'. (This happens because $latex \Pr(h|d)$ is sufficiently low.) So $latex \Pr(h|xd)$ cannot just be identified with the degree of confirmation that x affords h.
I don’t agree with Popper (as usual). We have $latex \Pr(h|d) = 1/6 < \Pr(h|xd) = 2/6$ and $latex \Pr(h'|d) = 5/6 > \Pr(h'|xd) = 4/6$. In other words, we started believing in h to the tune of 1/6, but after assuming (or being told) x, h becomes twice as likely. And we start by believing h' to the tune of 5/6, but after assuming x, this decreases to 4/6, or 20% lower. Yes, it is still true that h' given x and d is more likely than h, but so what? We just said (in x) that we saw a 2 or 4 or 6: h' is two of these and h is only one.
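The arithmetic is easy to verify by brute enumeration. Here is a minimal sketch in Python; the `pr` helper and the lambda names are mine, invented for illustration, not anything in Mayo or Popper:

```python
from fractions import Fraction

# The six faces; d is our standard background knowledge of the die.
faces = [1, 2, 3, 4, 5, 6]

def pr(event, given=lambda f: True):
    """Pr(event | given, d) by straight enumeration over the faces."""
    cond = [f for f in faces if given(f)]
    return Fraction(sum(1 for f in cond if event(f)), len(cond))

h  = lambda f: f == 6         # h: a 6 turns up
h_ = lambda f: f != 6         # h': a 6 does not turn up
x  = lambda f: f % 2 == 0     # x: an even number turns up

print(pr(h))        # Pr(h|d)   = 1/6
print(pr(h, x))     # Pr(h|xd)  = 1/3 (i.e. 2/6): x confirms h
print(pr(h_))       # Pr(h'|d)  = 5/6
print(pr(h_, x))    # Pr(h'|xd) = 2/3 (i.e. 4/6): x disconfirms h'
```

Both facts hold at once: x raises h and lowers h', yet h' remains the more likely of the two, exactly as the example says.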
“Does x (in the presence of d) confirm h?” is a separate question from “Which (in the presence of x and d) is the more likely, h or h’?” The addition of x to d “confirms” h in the sense that h, given the new information, is now more likely.
No problems so far, n’est-ce pas? And Mayo recognizes this in quoting Carnap, who noted “to confirm” is ambiguous. It can mean (these are my words) “increases the probability of” or it might mean “makes it more likely than any other.” Well, whichever. Neither is a difficulty for probability, which flows perfectly along its course. The problems here are the ambiguities of language and labels, not with logic.
No real disagreements yet. Enter the so-called “paradox of irrelevant conjunctions.” The idea is that if x “confirms” h, then x should also “confirm” hp, where p is some other proposition (hp reads “h & p”). There are limits: if p = h', then hp is always false, no matter which x you pick. Ignore these. As before we can say p is irrelevant to x if $latex \Pr(p|xd) = \Pr(p|d)$. Continuing the example, let p = “My hat is a fedora”; then $latex \Pr(p|xd) = \Pr(p|d)$, and so is $latex \Pr(p|hxd) = \Pr(p|d)$.
The next step in the “paradox” is to note that if x “confirms” h in the first sense above, then $latex \Pr(h|xd)/\Pr(h|d) > 1$. In our example, this is $latex (1/3)/(1/6) = 2$, which is indeed greater than 1. So we’re okay. Now we assume p is irrelevant, so $latex \Pr(hp|xd) = \Pr(h|xd)\Pr(p|d)$. Divide this by $latex \Pr(hp|d) = \Pr(h|d)\Pr(p|d)$; then because $latex \Pr(h|xd)/\Pr(h|d) > 1$, so too does $latex \Pr(hp|xd)/\Pr(hp|d) > 1$. Ho hum so far; just some manipulation of symbols.
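The symbol manipulation can be checked numerically if, purely for arithmetic’s sake, we tack a made-up number q onto the evidence for p. That q is an assumption of this sketch and no part of d, a point which matters below:

```python
from fractions import Fraction

# Purely to check the symbol-shuffling, grant the irrelevant p a
# probability q (an assumption bolted on; it is not part of d).
q = Fraction(2, 7)                      # any 0 < q < 1 gives the same ratios
faces = [1, 2, 3, 4, 5, 6]

# Joint space: (face, hat), with the hat independent of the die.
space = [(f, t) for f in faces for t in (True, False)]
weight = lambda f, t: q if t else 1 - q

def pr(event, given=lambda f, t: True):
    num = sum(weight(f, t) for f, t in space if given(f, t) and event(f, t))
    den = sum(weight(f, t) for f, t in space if given(f, t))
    return num / den

h  = lambda f, t: f == 6                # h: a 6 turns up
x  = lambda f, t: f % 2 == 0            # x: an even number turns up
hp = lambda f, t: f == 6 and t          # hp: a 6 turns up AND the hat is a fedora

# x "confirms" h, and by the division step it "confirms" hp by the same ratio:
print(pr(h, x) / pr(h))                 # → 2
print(pr(hp, x) / pr(hp))               # → 2
```

Whatever q you pick, the two ratios agree: the irrelevant p cancels out of the division, which is all the manipulation claimed.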
Then it is claimed that x, since it “confirmed” h, must also “confirm” hp. Well, this is so. Then Mayo says (still with my notation):
(2) Entailment condition: If x confirms T, and T entails p, then x confirms p.
In particular, if x confirms (hp), then x confirms p.
(3) From (1) and (2), if x confirms h, then x confirms p for any irrelevant p consistent with h.
(Assume neither h nor p have probabilities 0 or 1).
It follows that if x confirms any h, then x confirms any p.
That’s the “paradox.” I don’t buy it. Like most (all?) paradoxes, there was a trip up in evidence along the way.
In our example, in (2), h does not entail p, but hp does entail p. What does entail mean? Well, $latex \Pr(p|hpd) = 1$. The paradox says x confirms p just because hp entails p. Not a chance.
What’s happened here is the conditioning information, which is absolutely required to compute any probability, got lost in the words. We went from “x and hp” to “x and p”, which is a mistake. Here’s the proof.
If x confirms h, then $latex \Pr(h|xd) > \Pr(h|d)$ (using the weaker sense of “confirmed”). Because p is irrelevant to h and x, then $latex \Pr(p|xd) = \Pr(p|d)$ and $latex \Pr(p|hd) = \Pr(p|d)$ and $latex \Pr(p|hxd) = \Pr(p|d)$. But if p is confirmed by x, then it must be that $latex \Pr(p|xd) > \Pr(p|d)$. But $latex \Pr(p|xd)$ doesn’t exist: it has no probability. Neither does $latex \Pr(p|d)$ exist.1 What does wearing a hat or not have to do with dice? Nothing. You can’t get there from here. This is a consequence of p’s irrelevancy.
So p can’t be confirmed by x in the usual way. What if we add h to the mix, insisting $latex \Pr(p|hxd) > \Pr(p|hd)$? That buys us nothing, because again neither of those probabilities exists. You can’t have inequalities with non-existent quantities. And when we “tack on” irrelevant p, we’re always asking questions about $latex \Pr(hp|xd)$ or $latex \Pr(hp|d)$ and not $latex \Pr(p|xd)$ or $latex \Pr(p|d)$.
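Even if someone insists on handing p a number q anyway (d supplies no such thing; q here is pure assumption for illustration), conditioning on the dice evidence leaves p exactly where it started:

```python
from fractions import Fraction

# If one insists on giving the hat a probability q (d says nothing about
# hats), the dice evidence x still cannot budge p; only hp moves.
q = Fraction(1, 3)                      # assumed prior for p, not derived from d
faces = [1, 2, 3, 4, 5, 6]
space = [(f, t) for f in faces for t in (True, False)]
weight = lambda f, t: q if t else 1 - q

def pr(event, given=lambda f, t: True):
    num = sum(weight(f, t) for f, t in space if given(f, t) and event(f, t))
    den = sum(weight(f, t) for f, t in space if given(f, t))
    return num / den

p  = lambda f, t: t                     # p: the hat is a fedora
x  = lambda f, t: f % 2 == 0            # x: an even number turns up
hp = lambda f, t: f == 6 and t          # hp: a 6 turns up AND fedora

print(pr(p, x) == pr(p) == q)           # → True: x does not budge p
print(pr(hp, x) > pr(hp))               # → True: only the conjunction moves
```

The quantity that gets “confirmed” is always the conjunction hp, and only through its h part; p itself sits untouched.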
Result? No paradox, only some confusion over the words. Probability as logic remains unscathed. If anybody thinks the paradox remains, she should try her hand at stating the paradox purely using the probability symbols and not the mixture of words and symbols. The exercise will be instructive.
See the necessary comment by Jonathan D and my reply. Looks like JD found that the mistake actually starts earlier in the problem.
1Thinking every probability has a unique number is a mistake subjectivists make. They’ll say “Well, I believe $latex \Pr(p) = a$” for some number a, but what they have really done is inserted information and withheld it from the formula; i.e., when they make statements like that they’re really saying $latex \Pr(p|q)$ for some mysterious q that forms their belief. Given q that probability might even be right, but $latex \Pr(p|q)$ just is not $latex \Pr(p|xd)$. Still no paradox.