The over-production of wee p-values led to the downfall of Cornell Professor Brian Wansink, who is being made to retire (with what we can guess is a comfortable “package”).
We met Wansink before, in the context of cheating with statistics. According to Vox:
His studies, cited more than 20,000 times, are about how our environment shapes how we think about food, and what we end up consuming. He’s one of the reasons Big Food companies started offering smaller snack packaging, in 100 calorie portions. He once led the USDA committee on dietary guidelines and influenced public policy. He helped Google and the US Army implement programs to encourage healthy eating.
Ah, the love of theory. Science did so well with the simple things, like explaining (in part) gravity at the largest scales, why can’t it do well explaining small things, like what’s best to eat? Surely we can’t go by the wisdom of ages, since that’s anecdote and not blessed “randomized controlled” experiment.
Never mind all that.
Thirteen of Wansink’s studies have now been retracted, including the six pulled from JAMA Wednesday. Among them: studies suggesting people who grocery shop hungry buy more calories; that preordering lunch can help you choose healthier food; and that serving people out of large bowls encourages them to serve themselves larger portions…
There was also Wansink’s famous “bottomless bowls” study, which concluded that people will mindlessly guzzle down soup as long as their bowls are automatically refilled, and his “bad popcorn” study, which demonstrated that we’ll gobble up stale and unpalatable food when it’s presented to us in huge quantities.
Why these were even subjects of “research” is, I think, the more important question. But such is the grip of scientism that it probably won’t even strike you as odd that we laid aside the knowledge of gluttony in search of quantifying the unquantifiable.
I’m happy, however, to note that Vox sees parts of the problem:
Among the biggest problems in science that the Wansink debacle exemplifies is the “publish or perish” mentality.
To be more competitive for grants, scientists have to publish their research in respected scientific journals. For their work to be accepted by these journals, they need positive (i.e., statistically significant) results.
That puts pressure on labs like Wansink’s to do what’s known as p-hacking. The “p” stands for p-values, a measure of statistical significance. Typically, researchers hope their results yield a p-value of less than .05 — the cutoff beyond which they can call their results significant.
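It is worth seeing how easy p-hacking is. Here is a minimal sketch (pure simulation, stdlib only; the normal approximation to the t-test is mine, used so the example needs no outside libraries): generate one hundred comparisons of pure noise, where there is no real effect anywhere, and count how many clear the .05 bar anyway.

```python
import math
import random
import statistics

random.seed(1)

def approx_two_sided_p(a, b):
    """Welch-style t-statistic with a normal approximation to the
    two-sided p-value (adequate for samples of size 50)."""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))

# One hundred "studies" comparing two groups drawn from the SAME
# distribution: every null hypothesis here is true by construction.
hits = 0
for _ in range(100):
    a = [random.gauss(0, 1) for _ in range(50)]
    b = [random.gauss(0, 1) for _ in range(50)]
    if approx_two_sided_p(a, b) < 0.05:
        hits += 1

# On average about 5 of 100 noise-only comparisons come up "significant".
print(hits)
```

Run enough comparisons on one data set, report only the wee p-values, and a “novel finding” is guaranteed. That is the mechanism, and no fraud is required — only enthusiasm.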
There is no non-fallacious use of p-values. But that doesn’t mean that they can’t lead to true results. Here is a faulty syllogism: “Michael Moore is a fine fellow. Fine fellows support toxic feminism. Therefore, 1 + 1 = 2.” Although it’s a joke, p-values are like this in form. The conclusion in this argument is true, but not for the reasons cited.
That’s the best case. The usual one is like this: “I need a novel finding to boost my career, and here is some data. Here is a wee p-value. Therefore, men really aren’t taller than women on average.” The conclusion is absurd and goes against common sense, but since it has been “scientifically proved”, and it’s anyway politically desirable, it is believed.
When I am emperor I will ban all p-values, and I will also disallow the publication of more than one paper biannually. (These are among my lighter measures. Wait until you hear what I intend to do with Resurrection deniers.)
I won’t go through Vox’s incorrect explanation of p-values. You can read all about them here, or soon in a paper that has been accepted (peer reviewed, and therefore flawless): “Everything Wrong With P-values Under One Roof.” Also see the articles here.
I want instead to disabuse Vox of their explanation of Bayesian statistics:
While p-values ask, “How rare are these numbers?” a Bayesian approach asks, “What’s the probability my hypothesis is the best explanation for the results we’ve found?”
Not really. What Bayesians ask instead is “What’s the posterior probability this parameter in my ad hoc model does not equal zero?” You can call a non-zero parameter in an ad hoc model a hypothesis if you like—it’s a free country—but a non-zero parameter is such a narrow small thing that it’s undeserving of such a noble title.
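To make the distinction concrete, here is a minimal sketch of the kind of question Bayesian software actually answers. The model, prior, and data summary below are all invented for illustration: a conjugate normal-normal update, after which we report the posterior probability that the model’s parameter exceeds zero. Notice what is being computed — a probability about a parameter inside an assumed model, not the probability that any hypothesis about the world is true.

```python
import math

# Hypothetical data summary and model (all numbers made up):
# observations with known variance sigma2, prior mu ~ N(0, 1).
n, xbar, sigma2 = 30, 0.4, 1.0
prior_mean, prior_var = 0.0, 1.0

# Standard conjugate normal-normal update for the posterior of mu.
post_var = 1.0 / (1.0 / prior_var + n / sigma2)
post_mean = post_var * (prior_mean / prior_var + n * xbar / sigma2)

# "Posterior probability the parameter is not zero" in practice means
# something like this: the posterior probability mu > 0. (Under a
# continuous prior, the probability mu equals exactly zero is zero.)
z = post_mean / math.sqrt(post_var)
p_gt_zero = 0.5 * (1 + math.erf(z / math.sqrt(2)))
print(round(p_gt_zero, 3))
```

The output is a clean-looking number near 1 for these invented data, and it is tempting to read it as “the probability my hypothesis is true.” But it is conditional on the ad hoc model, the prior, and the decision to make everything about one parameter — which is exactly the narrowness complained of above.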
That’s the problem with (most) Bayesian statistics. It still hasn’t learned to speak of reality. For that, see this class.
Thanks to Sheri, Al Perrella, and some anonymous readers for the tip.