A paradox is a mistake in thinking; an artificial, human creation which usually arises because a conclusion which follows from a set of beloved premises is itself unloved.
Twitter user @alpheccar asked me to look at an example of the so-called Jeffreys-Lindley paradox, which was lately discussed by physicist Tommaso Dorigo at the Science 2.0 website. It is best to go there to read the complete details of his imagined experiment, but I’ll summarize it here. (I change his notation slightly.)
A particle counter collects n = 1,000,000 instances of some quantum mechanical event, of which n+ = 498,800 were “positive” and n- = 501,200 were “negative.” The details aren’t especially interesting to us, except to note that the theory which informs these counts suggests that the numbers of positive and negative hits should be equal.
Suppose you fire up the machine and run n = 1 instance of the event. Will n+ = n-; i.e., will the counts be equal? Obviously not: it is impossible. Is the theory (which we can call T) that said that n+ = n- then wrong? The answer, which may be counter-intuitive, is yes. T is false.
That is, suppose T says, “In any experiment, n+ = n-.” We ran an experiment, n+ did not equal n-, therefore we have falsified T. Not fair? Well, the word any does mean any: there is no escape. Suppose instead that T actually means, “In any experiment where n is divisible by 2, n+ = n-.” That is more of a fair playing field.
Okay, fire up the detector: n = 2 and (say) n+ = 2, n- = 0. Is T true or false? False again and for the same reason: n+ does not equal n-.
But wait a second. We’re talking about quantum mechanical events here. The realm of the truly uncertain. T allows more wriggle room; it is not as demanding as we have been suggesting. What T really means is this: “The probability of n+ is 1/2.” From T we can infer that the probability of n- is also 1/2. So now if we see n = 2 and n+ = 2, n- = 0, we are no longer sure that T is true or false, because given T these kinds of results can happen.
In fact, no matter what n is, if in any experiment we see n+ = n, n- = 0—or we see any other value of n+, n- —we cannot say that T is false because T says that any sequence of n+, n- can happen. As long as n is less than infinity, which it always will be (I mean always in the sense of always), no observations can prove T wrong.
There is only one other thing we can do. We can, for fun, calculate the probability, given T and a fixed n, of seeing n+, n- equaling their observed values. Since this is simple, I leave it as a homework assignment. And that, without adding more information to T, or providing alternatives to it, is all we can say (I mean all in the sense of all).
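For readers who want to check their homework, here is a minimal sketch of that calculation. It assumes that T makes the count n+ binomial with n trials and probability 1/2. The function name `log_binom_pmf` is invented for illustration, and the work is done in logs because the binomial coefficient overflows ordinary floating point when n is a million.

```python
from math import exp, lgamma, log

def log_binom_pmf(k, n, p):
    """Log of the probability of exactly k positives in n events,
    assuming each event is positive with probability p."""
    # lgamma(m + 1) is log(m!), so this is the log of "n choose k".
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + k * log(p) + (n - k) * log(1 - p)

# The n = 2 experiment: given T, seeing n+ = 2, n- = 0 has probability 1/4.
prob_small = exp(log_binom_pmf(2, 2, 0.5))

# The imagined experiment: n+ = 498,800 out of n = 1,000,000, given T.
prob_observed = exp(log_binom_pmf(498_800, 1_000_000, 0.5))

print(prob_small, prob_observed)
```

The second number comes out tiny (on the order of 0.00004), but so does the probability of every other individual value of n+ when n is this large, which is why this number alone cannot tell us T is false.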
Enter the “paradox”
Dorigo imagines a frequentist statistician seeing n+ = 498,800 and n- = 501,200. That frequentist, in order to simplify life, calculates x = n+/n and s^2 = x*(1-x)/n, and then plugs these values into a normal distribution as the central and spread parameters, i.e. he forms N(x, s). The frequentist also accepts that T is true. This lets him calculate the probability that x < 0.4988 (the observed value) given T is true and that this normal approximation is okay and given the plug-in values for the parameters are uncertainty-free. This calculation gives p = 0.0082.
The normal approximation isn’t really necessary; we can easily do the actual binomial calculation (in R this is pbinom(498800, 1e6, 0.5)) and it gives essentially the same answer. So skip worrying about the approximation. Worry instead about what this number means. It is, assuming T is true, the probability of seeing n+ = 498,800 or fewer hits in an experiment with n = 1,000,000 runs. Okay so far?
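Both versions of the frequentist’s number can be reproduced with a short sketch. The assumption here is that the 0.0082 is the one-sided tail area obtained by measuring the distance of x from T’s value of 1/2 in units of the plug-in spread s. The variable names are mine, and the exact sum plays the role of the R call above.

```python
from math import erfc, exp, lgamma, log, sqrt

n, n_pos = 1_000_000, 498_800

# Normal approximation: distance of x from T's value of 1/2, in units of s.
x = n_pos / n
s = sqrt(x * (1 - x) / n)
z = (x - 0.5) / s
p_normal = 0.5 * erfc(-z / sqrt(2))  # area of a standard normal below z

# Exact binomial: P(n+ <= 498,800 | T), summed downward from the observed count.
log_choose = lgamma(n + 1) - lgamma(n_pos + 1) - lgamma(n - n_pos + 1)
term = exp(log_choose + n * log(0.5))  # P(n+ = 498,800 | T)
p_exact = 0.0
k = n_pos
while k >= 0 and term > 1e-30:  # terms this small no longer change the sum
    p_exact += term
    term *= k / (n - k + 1)  # ratio P(k - 1) / P(k) when the probability is 1/2
    k -= 1

print(round(p_normal, 4), round(p_exact, 4))
```

Both figures round to 0.0082, so the approximation is indeed nothing to worry about.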
Now Dorigo imagines a (possibly inebriated) Bayesian thinking to himself that T might be true, as he was told it might be by the physicist. The Bayesian says to himself that “I might as well suppose that the probability that T is true is 1/2, which means the probability T is not true is also 1/2. Now T says that the probability of n+ is 1/2. But an alternative to T, call it T’, might say that the probability of n+ is 1/4. Still another alternative, T”, might say the probability of n+ is 2/3, and so on for all the other alternatives.”
How many alternatives to T are there? Uncountably many. Every number between 0 and 1 (excepting 1/2) is a potential alternative. The Bayesian doesn’t know which of these uncountably many alternatives is more likely true than another and so decides to give them all the same probability: what gets spread evenly over them is the 1/2 left over after assuming the probability T is true is 1/2.
Now, through the miracle of Bayes’s theorem, the Bayesian can calculate the probability T is true given the observed values of n+, n- and the assumption that T and its alternatives each had those certain chances to be true. This probability, again using the normal approximation, is q = 0.978.
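The Bayesian’s arithmetic can be sketched in a few lines as well. This version uses the exact binomial rather than the normal approximation, and it relies on a standard result: when the alternatives are spread evenly over every probability between 0 and 1, each possible count from 0 to n is equally likely, so the probability of the observed count under “not T” is 1/(n + 1). The variable names are mine.

```python
from math import exp, lgamma, log

n, n_pos = 1_000_000, 498_800
prior_T = 0.5  # the Bayesian's (questionable) starting probability for T

# Probability of the observed count given T (probability of n+ is 1/2).
log_choose = lgamma(n + 1) - lgamma(n_pos + 1) - lgamma(n - n_pos + 1)
like_T = exp(log_choose + n * log(0.5))

# Probability of the observed count given "not T", with the alternatives
# spread evenly over (0, 1): every count 0..n is then equally likely.
like_not_T = 1.0 / (n + 1)

# Bayes's theorem.
q = prior_T * like_T / (prior_T * like_T + (1 - prior_T) * like_not_T)

print(round(q, 3))
```

This lands on the same 0.978. Notice what drives it: the even spread puts nearly all of the “not T” weight on values like 1/4 and 2/3 that the data rule out completely, so T wins almost by default.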
Whoa! The frequentist “rejects” his “null” hypothesis that T is true, but the Bayesian says it’s all but certain that T is true. A paradox!
Both the frequentist and the Bayesian are out of their minds. Both have produced numbers which are completely useless and answer no real-life questions.
If T is true, the probability that n+ is 498,800 or fewer is of no interest to anybody. Remember, if T is true, any value of n+ which is less than or equal to n is possible, so merely seeing one of these values means nothing.
And what was that Bayesian drinking? Why is it that he thought the probability of T being true was 1/2 and that every other value was equally likely? Too in love with the math (or bottle), we can suppose. We’d have to verify this with the physicist (who knows the quantum mechanical niceties of the apparatus), but it strikes me as a nutty assumption. Since the Bayesian started with something absurd, his result is nothing more than a curiosity.
The only paradox here is why people trust us statisticians as much as they do.
What to do?
I’m going to disappoint you, but in so doing also answer the implicit question I just posed, by saying that the real answer is not easy, and perhaps not even quantifiable. In other words, people trust us because we provide these pretty little, and comforting, quantifications.
If our question is to ask, “What is the probability T is true?” we must provide alternatives or we must just accept that T is true. This is the answer—there is no other. By no other I mean no other.
Specifically, calculating the probability that n+ is 498,800 or fewer given T is true is not calculating the probability T is true. It assumes T is true. And anyway, who cares about the probability of values we didn’t see? Values such as n+ = 498,799, n+ = 498,798, and so on.
The Bayesian has the better idea—he does calculate the probability T is true given the observations—but his implementation was bizarre. He put strange a priori probabilities on alternatives that are (probably, we’d have to check) physically nonsensical.
What we need are realistic alternatives. What are they? I certainly don’t know; at least, not for this experiment. The physicist might be able to provide these, and even to say (given some theory) what the probability is that each of the alternatives is true. If he can, then the Bayesian can work his magic and incorporate them into Bayes’s formula and produce a quantification of the probability T is true, given the evidence of the alternatives and the experimental data.
If the physicist can’t quantify these alternatives (and chances are he can’t, since there are too many ways for an experiment to go wrong) he would be better off going by his gut. Are there any cables loose? Did somebody forget to divide by 2? Probably T + E is true (T plus some measurement error). Or perhaps the theory that gave rise to T needs to be altered? T really is false, but something like T is true. Do changes to this theory give us models (different T’, T”) which better predict the observed data? This is all hard work, and unavoidable hard work at that.
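To see how much the answer hangs on which alternatives are supplied, here is a sketch with a handful of named alternatives. Everything about these alternatives is made up for illustration: the values 0.499 and 0.4985 and their starting probabilities come from me, not from any physicist or theory, and real ones would have to be provided by somebody who knows the apparatus.

```python
from math import exp, lgamma, log

n, n_pos = 1_000_000, 498_800

def log_likelihood(p):
    """Log probability of the observed count if the probability of n+ is p."""
    log_choose = lgamma(n + 1) - lgamma(n_pos + 1) - lgamma(n - n_pos + 1)
    return log_choose + n_pos * log(p) + (n - n_pos) * log(1 - p)

# HYPOTHETICAL theories, keyed by the probability each assigns to n+,
# with made-up starting probabilities. Only the 0.5 entry (T) is from the text.
priors = {0.5: 0.50, 0.499: 0.25, 0.4985: 0.25}

# Bayes's formula, worked in logs so the tiny likelihoods do not underflow.
log_weights = {p: log(prior) + log_likelihood(p) for p, prior in priors.items()}
top = max(log_weights.values())
weights = {p: exp(lw - top) for p, lw in log_weights.items()}
total = sum(weights.values())
posterior = {p: w / total for p, w in weights.items()}

for p, prob in posterior.items():
    print(p, round(prob, 3))
```

With these invented alternatives T starts at 1/2 and ends up well below it, the opposite of what the evenly spread alternatives produced from the same data. The quantification is only as good as the alternatives fed into it.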
Conclusion: there is no paradox, only unrealistic assumptions. For a terrific discussion of this, and for more reasons why the answer is more complicated than we would like, go to Jaynes and read his “Queer uses of probability theory.”
Update: Although I go over a specific example here, it is paradigmatic: the paradox just isn’t one.