Many people (thank you, everybody!) have sent me Tom Siegfried’s article “Odds Are, It’s Wrong” and asked me to comment. Below are the key points; I will assume that you have already read Siegfried’s article. And stay tuned, because there will be more on this soon.
The inventor of p-values, R.A. Fisher, was an enormous intellectual fan of Karl Popper. Popper was also a huge fan of himself. Popper arrived at the idea that you could never know whether anything is true through induction.
You could only know if something is false through deduction. He also, somewhat oddly, said things like our degree of belief in the truth of any theory was just as high after we saw any corroborative data as it was before we saw any data.
What really mattered, he said, was whether a theory was falsifiable; that is, that it was contingent and that observations could be imagined (though not necessarily ever observed) that would prove the theory wrong. Most epistemologists nowadays, even the most famous ones like John Searle, no longer buy falsifiability. But Fisher did.
Sort of. Fisher knew that no probability theory was falsifiable. If a probabilistic theory said, “If theory M is true, X is improbable, with as low a probability as you like, but still greater than zero,” then whether or not X happens, M cannot be proved true or false. Specifically, it can’t be falsified. Ever.
But, Fisher reasoned, how about if we made a probability criterion that would indicate that something was “practically” falsified? This criterion cannot—it must not!—say whether any theory is true or false, nor can it say a theory is probably true or probably false.
It would instead say how likely or unlikely the results we saw would be if the theory were true. A p-value sort of does that, but it does so conditional on the equality of unobservable parameters of probability models.
But never mind that. If you ran an experiment and received a p-value of 0.05 or less, your result was publishable. It meant some statistic was improbable, assuming that some “null hypothesis” about parameters was true.
What it did not mean was that your theory was likely true, or that it was likely false. P-values cannot be used in any way to assert probability statements about theories. It is forbidden to do so in classical statistics.
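To make the distinction concrete, here is a minimal simulation of the p-value logic (a toy example of my own, not from Siegfried). It answers only the question “if the null were true, how often would we see a statistic at least this extreme?”—and says nothing about whether the null, or any rival theory, is probably true.

```python
import random

# Toy example (numbers mine): null hypothesis says a coin is fair.
# Suppose we observed 16 heads in 20 flips. The p-value is the chance,
# ASSUMING the null, of a result at least as far from 10 heads as 16 is.
random.seed(1)
n_flips, observed_heads, n_sims = 20, 16, 100_000

extreme = 0
for _ in range(n_sims):
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    # two-sided: at least as far from n_flips/2 as the observed count
    if abs(heads - n_flips / 2) >= abs(observed_heads - n_flips / 2):
        extreme += 1

p_value = extreme / n_sims
print(f"p-value ~ {p_value:.3f}")  # well under 0.05, so "significant"
```

The number that comes out is a statement about the data given the null; nothing in the calculation licenses a probability statement about the fairness hypothesis itself.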
Siegfried’s point is that nobody ever remembers this, and that nearly everybody uses p-values incorrectly.
He is 100% right.
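One reason the misuse matters: hunt long enough and a “significant” p-value always turns up. A toy simulation (mine, not from the article) testing pure noise many times over:

```python
import random
from math import erf

# Toy example (my own numbers): run many tests on pure noise and a
# "publishable" p-value will appear by chance alone.
random.seed(2)

def noise_p_value(n=30):
    # Crude z-test of "the mean is 0" on data whose mean really is 0.
    data = [random.gauss(0, 1) for _ in range(n)]
    z = (sum(data) / n) * n ** 0.5  # known sigma = 1
    # two-sided tail probability of the standard normal
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / 2 ** 0.5)))

p_values = [noise_p_value() for _ in range(500)]
print(min(p_values))                    # almost surely below 0.05
print(sum(p < 0.05 for p in p_values))  # count of "significant" flukes
```

Every one of those hypotheses is false by construction, yet roughly one in twenty tests clears the publishability bar anyway.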
A small p-value is supposed to mean a study was “statistically significant.” But as we have talked about over and over, if you run a study and cannot find a publishable (<0.05) p-value, it only means that you haven’t tried hard enough. In other words, statistical significance is nearly meaningless with respect to the truth or falsity of any theory. “Statistical significance” should join the scrap heap of science, whose historical inhabitants (N-rays, cold fusion, phlogiston, ectoplasm, and others) will welcome it with open arms.

Randomization
Coincidentally (?), we talked about this the other day. In my comments, I lazily asked readers to look up some quotes from biostatistician Don Berry. Siegfried has done the work for us.
In an e-mail message, Berry points out that two patients who appear to be alike may respond differently to identical treatments. So statisticians attempt to incorporate patient variability into their mathematical models.
“There may be a googol of patient characteristics and it’s guaranteed that not all of them will be balanced by randomization,” Berry notes.
Note: that’s “googol” the number and not that internet company. A googol is 10^100. That seems right to me. Notice, too, that, as I said, Berry said “it’s guaranteed” that imbalances between groups will exist. He goes on to say that control—whether in the modeling or in the experimental design—is what is important.
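Berry’s guarantee is easy to see by simulation. In this sketch (patient counts and trait counts are mine, purely illustrative), we randomize patients into two arms and count how many of a large number of binary characteristics end up noticeably out of balance:

```python
import random

# Toy sketch of Berry's point (numbers mine): randomize 100 patients
# into two arms, track 1,000 binary characteristics, and count how many
# characteristics end up noticeably imbalanced between the arms.
random.seed(3)
n_patients, n_traits = 100, 1000

# Each patient has each trait independently with probability 0.5.
patients = [[random.random() < 0.5 for _ in range(n_traits)]
            for _ in range(n_patients)]

# Randomize: shuffle the patients and split them in half.
random.shuffle(patients)
arm_a, arm_b = patients[:50], patients[50:]

imbalanced = 0
for t in range(n_traits):
    count_a = sum(p[t] for p in arm_a)
    count_b = sum(p[t] for p in arm_b)
    if abs(count_a - count_b) >= 10:  # a 20-percentage-point gap
        imbalanced += 1

print(imbalanced)  # dozens of traits out of balance, from randomization alone
```

With a googol of characteristics rather than a thousand, the conclusion is only more emphatic: randomization cannot balance everything.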
There is immense confusion about “prior” probabilities, or “prior distributions.” I won’t be able to end that confusion in one hundred words.
What I can tell you is that the central misunderstanding stems from forgetting that probability is conditional. Probability is a measure of information, a logical degree of belief. Just as you cannot say whether the conclusion of some logical argument is true or false without first knowing its premises, you cannot know the probability of some conclusion without knowing its evidence (premises by another name).
Forgetting this simple fact is what leads some people to mistakenly believe probability is subjective. If probability is subjective, critics say, then any prior can equal anything. They’re right. But if probability is conditional, then priors cannot equal anything, and must be fixed, conditional on evidence.
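A tiny beta-binomial sketch (all numbers mine, purely illustrative) of the point: state the evidence and the prior is fixed; state different evidence and you get a different, but equally fixed, prior—and a correspondingly different posterior from the same data.

```python
# Toy example (my numbers): two different bodies of evidence fix two
# different priors for a success probability.
#   Evidence E1: "nothing known beyond the 0-1 bounds" -> uniform, Beta(1, 1)
#   Evidence E2: "20 earlier trials gave 10 successes"  -> Beta(11, 11)
# New data in both cases: 7 successes in 10 trials.
successes, failures = 7, 3

def posterior_mean(a, b):
    # Beta(a, b) prior + binomial data -> Beta(a + s, b + f) posterior,
    # whose mean is (a + s) / (a + b + s + f).
    return (a + successes) / (a + b + successes + failures)

print(posterior_mean(1, 1))    # under evidence E1: 8/12
print(posterior_mean(11, 11))  # under evidence E2: 18/32
```

Neither prior is a whim: each is pinned down by the evidence written above it, which is the sense in which conditional probability leaves no room for “any prior can equal anything.”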
The other difficulty comes in jumping to infinity too quickly. Probability measures on continuous spaces inevitably lead to grief. But more on that another day.