Contributor William Raynor sets us a task:
Hi Matt, …I’d still like to see if you can write a column about what an empirical p-value really is without mentioning the word probability.
Since Raynor is a contributor, I consider this a consulting gig. It’s a difficult job because the ‘p’ in p-value stands for the-word-that-must-not-be-spoken. But let’s try. I won’t cheat, either, and use euphemisms like I just did.
We’re going to do this in two parts. The first is philosophical, and pertains to p-values as they are most often used in science and research. This should appeal to everybody. The second part takes on the “empirical” part of the request, where sometimes p-values seem to make more sense.
The p-value says something about what did not happen and, if the p-value is wee, asks you to believe that what did happen was caused the way you think it was caused.
Way it works is like this. Guy thinks that the new drug profitol causes an improvement in health. Or he believes carbon dioxide caused temperature to increase. Or he thinks men and women respond differently to a question, the differences being caused by the sex of the people answering.
In any and all of these cases—you’ve seen a million new-research-has-shown headlines—the p-value is used to prove a cause has been found. Which cause? The cause the researcher thought was a cause, and not something else.
That means if the p-value is wee, he’ll say the drug caused the improvement, that carbon dioxide caused the temperature to increase, that sex differences caused the answers on a questionnaire.
Before you get too excited, understand that everybody who uses a p-value knows that p-values can’t prove cause. This is what the math says, and the math is right. P-values have nothing to say about cause. But everybody uses p-values to prove cause anyway. They can’t help themselves. P-values are magic.
This knowledge that a p-value can’t find cause has nothing on the overwhelming desire that it should, though. And it’s only natural in our society that desire wins over truth.
All right. So what is a p-value? It is a mathematical construction that takes for its premise the belief that the cause envisioned by the researcher does not exist. That is, it is assumed in the math that the cause the researcher wants to believe is false or a fiction.
Thus, it wasn’t profitol that caused the improvements, but something else. It wasn’t carbon dioxide that caused the temperature to increase, but something else. It wasn’t sex that caused the differences in the questionnaire, but something else.
The something else is nearly always “chance”. What is chance? Nothing. Chance doesn’t exist, as we have seen many times. Chance is not like gravity or electricity. You can’t use chance as you could gravity or electricity to bring about an effect. It can’t be measured. It can’t cause anything. It is a state of mind, relative to a set of beliefs.
Even if you don’t follow (or believe) that, never mind. You don’t need it to understand p-values.
What is crucial is that the math behind the p-value assumes the cause the researcher was thinking of is nonexistent, non-operative, not around or weak or anemic to the point of vanishing.
This odd belief in the lack of the desired cause is called the “null hypothesis.” You’ve heard it. “The null is that profitol didn’t cause the improvement” and so on. We are supposing the improvement was observed. We are also supposing that temperature increased—but increase is tricky because of definitions of “trend”. And we are supposing there were observed differences in the scores by sex.
The math needs this null. This null premise is fed into the math, along with the data, and if the p-value is less than the magic number, a value so ubiquitous I don’t need to mention it, then everybody believes the null has been disproved.
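For readers who think better in code, here is a minimal sketch of how the null gets fed into the math along with the data. It uses a label-shuffling calculation rather than the textbook formulas, and it is only an illustration: the function name shuffle_p_value, the scores, and the number of shuffles are all invented for this example and come from no real study. Notice that the shuffling step is the null. It treats the profitol labels as meaningless from the start.

```python
import random


def shuffle_p_value(treated, control, n_shuffles=10_000, seed=0):
    """Fraction of label shuffles whose difference in means is at least
    as large as the one actually observed.

    Assumption (this IS the null): who got profitol makes no difference,
    so the group labels can be reassigned freely.
    """
    rng = random.Random(seed)
    observed = sum(treated) / len(treated) - sum(control) / len(control)

    pooled = list(treated) + list(control)
    k = len(treated)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend the labels never mattered
        diff = sum(pooled[:k]) / k - sum(pooled[k:]) / (len(pooled) - k)
        if diff >= observed:
            hits += 1
    return hits / n_shuffles


# Hypothetical health scores, made up purely for illustration.
profitol_scores = [5, 6, 7, 8]
placebo_scores = [1, 2, 3, 4]
p = shuffle_p_value(profitol_scores, placebo_scores)
print("p-value:", p)
```

Nowhere in that function is there a line that looks for profitol doing anything. The number that comes out describes shuffles that did not happen, under a premise that the drug is irrelevant.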
I’ll be blunter. People believe the null is false when the p is wee. And when the p is not wee but they really, really want their cause to be true, the p-value is ignored. It’s a good thing our patron has banned a certain word, because it’s utterly inapplicable here. The null isn’t maybe or perhaps false if the p is wee; it is decided to be false.
Well, if the null is false, what then? That the improvement, temperature increase and score difference were observed means they must have been caused by something. And the only things on the researcher’s mind are the causes profitol, carbon dioxide, and sex.
If you accept the premise “Either profitol was the cause, or it was something else”, and you believe the p-value has proved the something else is false, you must believe it was profitol that was the cause.
But this is absurd. This is the p-value fallacy. The p-value calculation says nothing about cause. It assumes the something else is the cause. It is therefore impossible to move to believing profitol was the cause on the strength of a calculation that assumes it wasn’t, regardless of the p-value’s size.
Well what if the p is not wee, and the observed changes were ambiguous, does that prove that profitol, carbon dioxide, and sex were not causes?
No. Obviously no. Why? Because the p-value assumes that profitol etc. were not causes! That’s not proof, that’s assumption.
The thing with a non-wee p-value is that profitol could have sometimes caused, or aided in the cause of, an improvement for some patients. Or carbon dioxide could have caused, in conjunction with other things, part of the temperature change. And sex could have caused some differences.
There! The forbidden word was never used but I hope you now have some idea of what a p-value is. To know it truly requires the forbidden word, and some math. You can read a series of proofs—iron-clad unbustable rigorous proofs—why every use of a p-value is a mistake or a fallacy, in the papers linked here. Next week we try and remove the p-value fallacy in those times where the causes in the “null” are known.