It is often difficult to keep in mind all the links in a chain of argument when that chain is long.1 This is especially so when one expects that chain to lead to a familiar place, and instead it veers in an unexpected direction. One sits “at home” waiting for the argument to reach them, but it never does. The (incorrect) conclusion is that the chain is faulty.
The argument I gave these past two weeks lands in an entirely different spot from that offered by the classical school of statistics. Not a new spot, just a different one. The theory I have been calling “predictive” is actually old. Philosophically old, not mathematically. The philosophy is positivism or logical probability (see below about “positivism”).
This view of probability is the one given by John Maynard Keynes (A Treatise on Probability), Harold Jeffreys (Theory of Probability), Rudolf Carnap (Logical Foundations of Probability), Edwin Jaynes (Probability Theory: The Logic of Science), Bruno de Finetti (Theory of Probability), Seymour Geisser (Predictive Inference: An Introduction), David Stove (The Rationality of Induction), and other similar authors. And don’t forget Pierre-Simon, Marquis de Laplace (A Philosophical Essay on Probabilities).
Few, if any, of these books are ever read or taught in ordinary university courses in probability and statistics, so it is not surprising they are unfamiliar to most statisticians. Certainly these authors never show up in undergraduate classes. And when they appear in graduate courses (a rare thing), it is the mathematics which is emphasized, not the philosophy.
I have tried to be careful to show that, even if the classical frequentist theory is accepted, the language used to describe results is often wrong or exaggerated. Few recall the precise definitions of p-values and confidence intervals, for example. Even textbooks are schizophrenic: different interpretations of these creatures often appear even within individual books! This is partly because the definitions are tangled, long, and non-intuitive.
But even if the language is proper, the results themselves (for which, I hasten to add, the mathematics are flawless) are not what anybody wants; worse, the results produce over-certainty. If presented with the choice of two drugs, what you want to know is the chance that one is better than another. You are not interested in the probability of seeing a larger value of an obscure test statistic, assuming that you repeat some trial an infinite number of times and assuming the truth of some probability model whose parameters are set equal to one another or set equal to zero (this is, of course, the p-value).
What exacerbates the disparity is that the evidence given by the p-value is guaranteed to be exaggerated with respect to the chance that one drug is better than another (this is a provable, and well proved, mathematical claim). The p-value can be as small as you like (“highly statistically significant”), but that does not imply that the chance that one drug is better than another is high.
The situation is improved when one moves to considering posterior distributions of parameters of models. But not by much. Language used to describe posterior distributions is certainly more intuitive and memorable than that used in hypothesis testing, but knowledge of the value of a parameter (given some data and accepting the truth of some model) still exaggerates the evidence that one drug is better than another. Once again, one can have almost certain knowledge of the value of some parameter or parameters (as in estimation or hypothesis testing), but this does not translate into high probability that one drug is better than another.
The situation is improved once more by moving to predictive statistics. Improved, not solved—for there is no solution. Uncertainty in contingent events and theories will always be with us. An event or theory being contingent makes this so. But at least we can say the chance that one drug is better than another is X%—given we accept the truth of some model and given our observations.
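To make the three stages above concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption and not taken from any real trial: a toy normal model with a known standard deviation `sigma`, flat priors on the two drug means, equal group sizes `n`, made-up sample means, and the reading of “one drug is better than another” as “a new patient on drug A has a higher outcome than a new patient on drug B.” The function name `compare_drugs` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def compare_drugs(mean_a, mean_b, sigma, n):
    """Return (p_value, posterior_prob, predictive_prob) for a toy model.

    Assumptions (illustrative only): outcomes are normal with known
    standard deviation `sigma`, each drug was given to `n` patients,
    and the priors on the two means are flat.
    """
    std_normal = NormalDist()  # standard normal, mean 0 and sd 1
    diff = mean_a - mean_b

    # Classical stage: two-sided p-value for "the two means are equal".
    standard_error = sigma * sqrt(2.0 / n)
    z = diff / standard_error
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))

    # Parameter stage: posterior probability that mean A exceeds mean B.
    # With flat priors the posterior of the difference is normal,
    # centred at `diff` with sd `standard_error`.
    posterior_prob = std_normal.cdf(z)

    # Predictive stage: probability that a NEW patient on A has a higher
    # outcome than a NEW patient on B. The patient-to-patient spread
    # (2 * sigma^2) is added to the uncertainty in the means.
    predictive_sd = sigma * sqrt(2.0 + 2.0 / n)
    predictive_prob = std_normal.cdf(diff / predictive_sd)

    return p_value, posterior_prob, predictive_prob


# Made-up numbers: 1000 patients per drug, means 51 and 50, sigma 10.
p_value, posterior_prob, predictive_prob = compare_drugs(
    mean_a=51.0, mean_b=50.0, sigma=10.0, n=1000
)
print(f"p-value:                         {p_value:.3f}")
print(f"P(mean A > mean B | data):       {posterior_prob:.3f}")
print(f"P(new A patient > new B | data): {predictive_prob:.3f}")
```

With these invented inputs the p-value comes out “significant” (about 0.025) and the posterior probability about the parameters is close to certainty (about 0.99), yet the chance that a new patient actually does better on drug A is only a little above a coin flip (about 0.53). Different models or priors would give different numbers; the ordering of the three is the point of the sketch.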
Hypothesis testing and Bayesian parameter estimation cannot give direct evidence whether the models we have been assuming true really are. This too is a provable, and well proved, mathematical and philosophical statement. Predictive methods can and do give us this evidence. Further, this evidence is expressed in natural language in statements of probabilities of observables and in terms of decisions we make with these observables.
I’ll show more examples as time goes on. But for those who want a head start, you cannot go wrong by reading the books given above.
Update: When I say positivism, I do not mean the logical positivism of the Vienna Circle. I do mean positivistic. See comments below.
1. Another difficulty is when that chain is offered in 800-word, non-contiguous chunks. And when those words are unclear, the situation is not made any easier. But I’m working on fixing that.