Regular readers will know the arguments and evidence proving P-values should no longer be used for any reason.
If you still doubt, which I doubt, read this paper. Every decision of interest made with a P-value is the result of a fallacy or a mistake.
There is a move afoot to end the tyranny of statistical “significance” and hypothesis testing. It’s not only Yours Truly shouting in the darkness, but a large and growing movement. P-eep at this:
Please retweet: Already 420 people have signed the forthcoming Nature comment 'Retire statistical significance'! The new deadline to send an E-mail to email@example.com and add your name to the list is Friday 8 March. Thanks a lot!
— Valentin Amrhein (@vamrhein) March 7, 2019
It is obviously time to move past P-values. Obvious to statisticians, that is. Not necessarily to non-statisticians:
JAMA rejected this letter from my colleagues & me ("low priority"), so we're publishing on twitter, hoping JAMA will take it more seriously. pic.twitter.com/bh7byo4CrR
— Ken Rothman (@ken_rothman) May 9, 2017
Sad story, no? Though one more common than bedbugs. The P-value was not wee, and so there was weeping and the gnashing of teeth. Worse, there was the false conclusion that “no difference” was found because the P wasn’t wee.
This kind of thing happens so often there ought to be a name for it. P-envy, perhaps. Or Wee-P envy.
All statisticians know that disappointment among clients is palpable when the Ps aren’t wee. We (some of us, anyway) try to explain the weaknesses of P-values, about how “significance” has no bearing in real life, how P-values mislead, how useless they are, how there are better methods. Some of us provide the better methods—the predictive methods. And these often show interesting things, even (or even often) with non-wee Ps.
But no good. If the P isn’t wee, we are asked often to recut, reanalyze, redo the analysis to find the hidden wee P that is surely there. Of course, this does work. Wee Ps lurk everywhere, and all it takes is time to find them. Judging by all the weak papers we see flooding the scienceosphere, they are found.
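A minimal simulation (my own illustration, not from the paper) of why wee Ps lurk everywhere: run enough comparisons on pure noise and some will come out "significant" by construction. Here every null hypothesis is true, yet a handful of the hundred comparisons still produce p < 0.05.

```python
import numpy as np

rng = np.random.default_rng(1)
n_tests, n, n_perm = 100, 30, 500

def perm_pvalue(a, b, rng, n_perm=500):
    """Two-sided permutation p-value for a difference in means."""
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    na = len(a)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(pooled[:na].mean() - pooled[na:].mean())
        if diff >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

p_values = []
for _ in range(n_tests):
    a = rng.normal(size=n)  # both groups drawn from the SAME distribution:
    b = rng.normal(size=n)  # every null is true, so any "find" is spurious
    p_values.append(perm_pvalue(a, b, rng, n_perm))

wee = sum(p < 0.05 for p in p_values)
print(f"{wee} of {n_tests} all-noise comparisons still gave p < 0.05")
```

By design roughly five percent of pure-noise comparisons clear the 0.05 bar; a determined reanalyst need only keep cutting the data until one of them turns up.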
P-values work like magic. People—not you, dear readers, no, not you—really do think that when a wee P has been found, a real cause has been discovered, a real theory has been proved. Arguments proving these ideas false are not heard, not heeded.
That deafness will be the problem.
As the tweet above says, there will soon be news of several hundred statisticians and scientists moving in a very public way against P-values. The computer science crowd has already beaten us to it, it’s true; but better now than never. Things cannot go on as in the past. Hence the call to make the change.
What will be the result? The Revenge of Inertia.
The biggest impediment will not be other statisticians, who in reasonable order will follow the proofs. It will be users of statistics. It will be editors and referees of medical and sociology journals and the like. All academic statisticians have experienced being lectured by non-statistician journal referees on things like “You really should use this test and not that; they give better P-values.”
The fault, as I have said over and over again, is ours. We taught people to use hypothesis testing. We winked at the caveats and weaknesses, ignored the philosophy, and then plunged ahead to the more fun math. Math is easy to test, and math is true. That doesn’t mean, of course, that the math is applicable to real questions, but never mind. The math is what counted (I hope this hilarious pun does not go unnoticed).
P-values, since they offer an easy way out of the heavy labor of hard thought, won’t be given up without a fight. Users will say, and say truthfully, “P-values have worked for us!” They do work, all right. They make decisions for users, decisions which are made far too easily. Yet it’s also so that sometimes P-values make the right decisions.
I give a sketch in the paper (linked above) of why this is so, but it’s usually when smart people set up good experiments based on plenty of foreknowledge. In these cases, the P-value can be a sort of proxy for the predictive probability. This is not a good reason to use P-values, though, when you can get the more productive and non-fallacious predictive probabilities instead.
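A rough sketch of the predictive alternative (my own construction with made-up numbers, not an example from the paper): instead of asking whether a test statistic crosses a significance threshold, ask directly for the probability that a new patient on treatment A does better than one on treatment B. The normal model, flat prior on the mean, and plug-in of the sample standard deviation are all deliberate simplifications for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical trial data: outcome improvement under treatments A and B
# (the "true" means 1.0 and 0.5 are assumptions for this illustration).
a = rng.normal(1.0, 2.0, size=40)
b = rng.normal(0.5, 2.0, size=40)

def predictive_draws(x, rng, n_draws=20_000):
    """Approximate posterior predictive draws for one NEW observation,
    using a normal model with a flat prior on the mean and the sample
    standard deviation plugged in for sigma (a known simplification)."""
    s = x.std(ddof=1)
    mu = rng.normal(x.mean(), s / np.sqrt(len(x)), size=n_draws)
    return rng.normal(mu, s)

new_a = predictive_draws(a, rng)
new_b = predictive_draws(b, rng)

# The quantity of direct interest: the chance a fresh patient fares
# better on A than on B. No "significance" threshold required.
p_better = np.mean(new_a > new_b)
print(f"Pr(new A outcome > new B outcome) ≈ {p_better:.2f}")
```

The answer is a probability about observables, which a decision-maker can weigh directly against costs, rather than a statement about a test statistic under a hypothesis nobody believes.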
There are alternatives that do not suffer from the unfixable flaws of P-values. Such as this. Software even exists!
No more P-values, please.
I always thought p-values were non-science areas trying to become scientific: an attempt to make things like nutrition, psychology, etc. look like “hard” science when they were not and never will be. I guess it was effective. It made millions for the personal injury weasels that I see ads for all day long. Ruined science, but who cares? It’s all about money and it probably always will be. I hope there is a change, but I’m not holding my breath while waiting to see.
Well, what do you expect from doctors? Remember, they study biostatistics, which is to statistics as astrology is to astronomy. Doctors think you can prove causality with statistics. I don’t know how many medical studies I have read that used the term “causal association” in the study. Causal association is an oxymoron, like jumbo shrimp.
Perhaps non-scientific areas like agronomy and plant/animal breeding? No science there, I guess. Voodoo particle physicists don’t look at signal/noise ratios and compare those ratios to the background (expected values), either.
Re biostatistics (aka rebranded biometry), see above. Causal knowledge comes from outside the statistics. I think Matt has mentioned that. The statistics are just used to organize the data according to the outside knowledge. If you lack the background to read the paper, then yes, it seems silly. When I try to read medical literature, I usually try to find a friendly M.D. to explain the medical background to me.
Re: ‘in these cases a low P-value may be a sort of proxy for predictive probability’ at the very end.
Not sure what all that might or might not mean … in many models a low P indicates that spending more money to collect data to develop the model further is probably a waste of time unless some parameters are changed — this is not the same as suggesting the model is any good, only that more of the same data will have a rapidly declining marginal benefit, if any added benefit.
That’s a sign to start looking for where the model’s utility resides, and not to ‘declare victory’ and conclude the model is any good.
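One way to see why a wee P is no ground for declaring victory (my own illustration, not the commenter's): hold a practically trivial effect fixed and the P-value still marches toward zero as the sample size grows. The P tracks n, not whether the model is any good.

```python
import math

def z_pvalue(effect, n):
    """Two-sided p-value for a one-sample z test of a mean shift of
    `effect` standard deviations at sample size n (noise-free, to
    isolate the role of n)."""
    z = effect * math.sqrt(n)
    return math.erfc(z / math.sqrt(2))  # equals 2 * (1 - Phi(z))

# A shift of 0.02 standard deviations: trivially small, and fixed.
for n in (1_000, 100_000, 10_000_000):
    print(f"n = {n:>10,}   p = {z_pvalue(0.02, n):.4g}")
```

The effect never changes; only the sample size does, yet the conclusion flips from “nothing here” to “highly significant.”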
When statistics abandons P-values statisticians will try to get away with it!
“NO! not that, this!”
“it was them, not us…it’s all their fault.”
Placebo affects everybody. P-values have a placebo effect all of their own! On statisticians’ salaries. “They’re all doing it, why shouldn’t I?”
Trial and error is really how it all works in medical and surgical practice.
Statistics is there to hold up the rear!
When treatment gets serious, it’s still carried out on terminal or ‘nothing to lose’ cases, by consent of the patient that is the live experiment.
That’s the real cutting edge.
Why do people need a statistician to tell them what happened?
“Wow, did that happen by accident? By chance?”
Because statisticians told them they were needed. It can’t have come about the other way around.
Whatever works fastest or with least invasion to a patient is the right approach.
Statistics doesn’t help except in counting which approach won the race.
In pain science there is work now being done to try to show the numbers and turn the discipline into something mathematical. All for the right reasons, no doubt. All that happens is people show what they already knew. Why? Because they know people. When what they knew was wrong, nothing changes until someone comes up with a better idea or a new one (notices a new and interesting observation).
I would suggest that where statistics went wrong was in the individuals working in the field who actually thought they were cleverer than the other guys working with patients.
Hubris…now it’s not even their fault! Such never ending arrogance.
I see only one or two examples around who can even talk properly about the subject, let alone explain to medics why what they taught them was wrong.
If they’re supposed to be all knowing about that, then why the need for a statistician?
They should be vetted and ratified for safe use near a clinical environment.
My ex fiancé used to read our journal for a laugh. He was a software engineer. It doesn’t take a statistician to know something’s BS. If someone wanted to write a several page article about some corner of a ligament, or how to rehab a particular joint of the thumb and think that drawing a graph is going to help then good luck to them.
Statistics has its place in other fields.
Just as when the vet told me to put my dog on cage rest for a month I knew it would destroy her and knew from humans what she needed, I took no notice of the vet who had probably read a study that proved it was so.
Caring professions are not scientific matters. Medicine is the same. It all starts with observation.
Lastly, which is where I came in many years ago.
When I discovered that the ‘statistical significance’ phrase was really as vacuous as it sounded, the wonder was why nobody had sorted it out a long, long time ago. Like five minutes after somebody came up with the idea. Something is either significant or not. How can numbers help?
“Causal knowledge comes from outside the statistics. ”
If the doctors know the cause why do they need statistics? Just explain the causal mechanism and etiology.
Yeah, I was being casual about causal. The hypotheses (abductions) about causality come from outside the data at hand. The measurements are not chosen randomly, which makes the univariate true null hypothesis something of a rare thing. In practice there are other variables that need to be adjusted/controlled and the p-value of interest is conditional on those.
“Just explain ….” – you don’t get out much, do you? “argumentum ab auctoritate” – indeed. Statistics is supposed to reduce that.
“If the doctors know the cause why do they need statistics? Just explain the causal mechanism and etiology.”
Yes. Just say! That’s absolutely right. Try it though…
The ones insisting it isn’t tough enough science are partly the ones not thinking right.
The human body is by its nature not like a machine or a system that can be treated like one, in practice. ‘We’ all start off thinking it works like that. Just biomechanics all the way.
Some, not saying which side, think it makes a thing look properly considered.
Medics trust you people too much. Maybe they don’t know what it is they need the statistics for. It’s a two-way contagion.
Particularly when they’re always being told they are so dumb. Having chosen a dumb, pseudoscientific profession. I’ve noticed such claims about groups who choose certain professions.
Isn’t the duty of care with the person being asked for their services and who knows it can’t help?
If a patient didn’t need my help I always said so. Even if lawyers told them they did. It often caused a disagreement until they understood why. That’s the difference between good and bad practitioners of anything.
When explaining the mechanism of a thing like pain in the human body, it is always a matter of opinion. Critical thinking and objectivity are supposed to be uppermost in your mind while you try not to look or sound surprised and take in the most important evidence, which is the subjective, in my view, really.
Knowing a bone is broken is the easy part.
Someone really well known said,
“this is really good data” which I think is an illegal thing to say. ‘Dat is just the data’.
He says a few things which sound suspect but yet I agree with his conclusions. Maybe people are right for the wrong reasons.
Someone gets better because you smile and listen. Then explain.
You don’t know if something happened for the better because of your input. Nature doesn’t work that way. The more you understand the less you think you know of the real mechanics.
“If the doctors know the cause why do they need statistics? Just explain the causal mechanism and etiology.”
It’s more fun with statistics?
Nobody has done a statistical study to determine if being shot in the head is bad. Why? Because the answer is obvious. Besides, if something is the cause it should have the same result at least most of the time (jumping out of a flying aircraft causes severe injuries unless you are wearing a parachute, e.g.).
Statistics is used to find iffy causes — things that maybe cause bad things some of the time. As in, X causes cancer less than 1% of the time. Is X really the cause? Gotta wonder.
You still need statistics to answer whether X is really the cause — even when it seems obvious. Being shot in the head is obviously bad, but then statistically it is bad most of the time. It’s obvious because it is statistically so.
There is a statistics term for that. It’s called the IOT test: “Inter-Ocular Trauma”, as in: it hits you between the eyes.