The paper is “Researcher Requests for Inappropriate Analysis and Reporting: A U.S. Survey of Consulting Biostatisticians” by Min Qi Wang, Alice F. Yan, and Ralph V. Katz in Annals of Internal Medicine.
As one headline put it, “1 In 4 Statisticians Say They Were Asked To Commit Scientific Fraud.”
That’s the wrong headline, though. It should read “Of the three in four who chose to answer, one in four biostatisticians admitted being asked to commit fraud.”
How many biostatisticians committed fraud they do not say. Smart money says at least one. Perhaps there is a way to get a p-value on that?
Anyway, our authors went online and dangled one hundred bucks minus one in front of some ASA members, got over 500 takers (out of 4,000 asked), of which just under 400 answered the questions. We’ll never know what happened to the statisticians who vanished or to those who never bothered. Perhaps some found the questions too painful? We’ll have to agree that their missing answers don’t count, which is, after all, the standard trick. We might title the maneuver Wish Replaces Data.
Concentrate instead (of those who answered) on the top, or “most severe,” complaint, which we’ll highlight:
“Falsify the statistical significance (such as the P value) to support a desired result.”
Golly. But of those that answered (a circumlocution I will now drop, but it’s there; it’s always there) only a few say they were asked to do this. That the item was rated so severe is proof enough that p-values are magic. Or are seen as magic by most. Our refrain hasn’t changed: it’s time for the p-value to go.
If I read the table right, it looks like the most common actual fraud request was “Stress only the significant findings, but underreport nonsignificant ones”, which just over half said happened to them. This has certainly happened to me. Often.
It’s usually subtle. “I notice the graphs cross for large x,” ran one recent request, the crossing indicating increasing uncertainty in some measures for large x, “so can we stop the plot at x=y to show only the significant parts?”
Now this isn’t fraud, per se, and the people who asked are fine folks. They wanted to accentuate the positive and eliminate the negative, that’s all. Scientists, our younger readers may be shocked to hear, are like every other human being.
Or maybe that fits under “Do not show plot because it did not show as strong an effect as you had hoped”, which also happened to about half.
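To picture the request from that anecdote, here is a toy sketch with entirely invented numbers: an estimated effect whose uncertainty bands widen with x, plotted honestly over the full range, and then again truncated just before the bands swallow the effect. The curve, the band widths, and the cutoff are all made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 200)
est = 1.0 - 0.08 * x                 # invented effect estimate, fading with x
half = 0.1 + 0.15 * x                # invented uncertainty, growing with x

fig, axes = plt.subplots(1, 2, figsize=(9, 3), sharey=True)
for ax, cut, title in [(axes[0], 10.0, "Honest: full range"),
                       (axes[1], 4.0, "As requested: stop before the bands widen")]:
    m = x <= cut
    ax.plot(x[m], est[m])
    ax.fill_between(x[m], est[m] - half[m], est[m] + half[m], alpha=0.3)
    ax.axhline(0.0, color="gray", lw=0.8)  # the "no effect" line
    ax.set_title(title)
plt.show()
```

The right panel tells no literal lie; it merely stops talking at the convenient moment.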
Next was “Report results before data have been cleaned and validated”; also about half. What happens here is usually less sinister than it sounds: laziness or anxiousness and nothing more.
Many of the other requests for fraud have to do with p-values. “Request to not properly adjust for multiple testing when ‘a priori, originally planned secondary outcomes’ are shifted to an ‘a posteriori primary outcome status’.” “Conduct too many post hoc tests, but purposefully do not adjust [alpha] levels to make results look more impressive than they really are.”
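To see why the unadjusted post hoc hunt “works,” here is a minimal simulation, everything invented: each pretend study runs twenty t-tests on pure noise, where there is nothing whatsoever to find, and we count how often at least one unadjusted p-value comes up wee.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_sims, n_tests, alpha = 2_000, 20, 0.05

hits = 0
for _ in range(n_sims):
    # Twenty post hoc t-tests on pure noise: there is nothing to find.
    pvals = [stats.ttest_ind(rng.normal(size=30), rng.normal(size=30)).pvalue
             for _ in range(n_tests)]
    if min(pvals) < alpha:  # no adjustment: any one wee p "counts"
        hits += 1

print(f"At least one p < {alpha} in {hits / n_sims:.0%} of studies")
print(f"(theory: 1 - 0.95**{n_tests} = {1 - 0.95**n_tests:.0%};")
print(f" a Bonferroni threshold of {alpha / n_tests:.4f} restores honesty)")
```

Roughly two studies in three “discover” something where nothing exists, which is exactly the impressiveness the requesters are shopping for.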
That whole swath of cheating can be eliminated, or its worst effects limited, by switching to predictive methods. Put out a model in a form that can be checked by anybody, and it will be checked. Plus, you have to work (massage, tweak, manipulate) about four to eight times harder to make the results look good. I mean, anybody can get a wee p-value, but it takes a real man to get a strong predictive result.
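What does “put out a model in a form that can be checked” look like? A minimal sketch, with every number invented: hunt through fifty noise “predictors” until one earns a wee in-sample p-value, then score its predictions on held-out data against the dumbest possible rival, guessing the training mean.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
y = rng.normal(size=200)                  # the outcome: pure noise
train, test = slice(0, 100), slice(100, 200)

# Hunt through fifty noise "predictors," keeping the wee-est in-sample p-value.
best_fit, best_x, best_p = None, None, 1.0
for _ in range(50):
    x = rng.normal(size=200)
    fit = stats.linregress(x[train], y[train])
    if fit.pvalue < best_p:
        best_fit, best_x, best_p = fit, x, fit.pvalue

print(f"Best in-sample p-value after the hunt: {best_p:.3f}")  # often wee

# Now the predictive check: score the model on data it never saw.
pred = best_fit.intercept + best_fit.slope * best_x[test]
mse_model = np.mean((y[test] - pred) ** 2)
mse_naive = np.mean((y[test] - y[train].mean()) ** 2)  # just guess the mean
print(f"Held-out MSE: model {mse_model:.2f} vs. naive mean {mse_naive:.2f}")
```

The wee p-value survives the hunt; the held-out score does not. That asymmetry is the whole argument.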
The only other thing of real interest is the “discovery” that fraud “requests were reported most often by younger biostatisticians.”
This implies that either the fraudsters looked at the younger biostatisticians and thought them vulnerable, or the older biostatisticians more often gave in (and did not admit coercion).
How sad to think that scientists are not as they are portrayed in the movies!