The article profiles the work of John Ioannidis, who has spent a career trying to show the world that the majority of peer-reviewed medical research is wrong, misleading, or of little use. Ioannidis “charges that as much as 90 percent of the published medical information that doctors rely on is flawed…he worries that the field of medical research is so pervasively flawed, and so riddled with conflicts of interest, that it might be chronically resistant to change—or even to publicly admitting that there’s a problem.”
“The studies were biased,” he says. “Sometimes they were overtly biased. Sometimes it was difficult to see the bias, but it was there.” Researchers headed into their studies wanting certain results—and, lo and behold, they were getting them. We think of the scientific process as being objective, rigorous, and even ruthless in separating out what is true from what we merely wish to be true, but in fact it’s easy to manipulate results, even unintentionally or unconsciously. “At every step in the process, there is room to distort results, a way to make a stronger claim or to select what is going to be concluded,” says Ioannidis. “There is an intellectual conflict of interest that pressures researchers to find whatever it is that is most likely to get them funded.”
Most medical studies—and most studies in other fields—rely on statistical models as primary evidence. The problem is that the way these statistical models are used is deeply flawed. That is, the problem is not really with the models themselves. The models are imperfect, but the errors in their construction are minimal. And since (academic) statisticians care primarily about how models are constructed (i.e. the mathematics), the system of training in statistics concentrates almost solely on model construction; thus, the flaw in the use of models is rarely apparent.
Without peering into the mathematical guts, here is how statistical studies actually work:
- Data are gathered in the hopes of proving a cherished hypothesis.
- A statistical model is selected from a toolbox which contains an enormous number of models, yet it is the hammer, or “regression”, that is almost invariably pulled out.
- The model is then fit to the data. That is, the model has various drawstrings and cinches that can be used to tighten itself around the data, in much the same way a bathing suit is made to form-fit around a Victoria’s Secret model.
- And to continue the swimsuit modeling analogy, the closer this data can be made to fit, the more beautiful the results are said to be. That is, the closer the data can be made to fit the statistical model, the more confident the researcher is that his cherished hypothesis is right.
- If the fit of the data (swimsuit) on the model is eye-popping enough, the results are published in a journal, which is mailed to subscribers in a brown paper wrapper. In certain cases, press releases are disseminated showing the model’s beauty to the world.
Despite the facetiousness, this is it: statistics really does work this way, from start to finish. What matters most is the fit of the data to the model. That fit really is taken as evidence that the hypothesis is true.
But this is silly. At some point in their careers, all statisticians learn the mathematical “secret” that any set of data can be made to fit some model perfectly. Our toolbox contains more than enough candidate models, and one can always be found that fits to the desired, publishable tightness.
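To make the “secret” concrete, here is a minimal sketch, not anybody's actual analysis. It assumes eight made-up data points that are pure random noise and uses a Lagrange interpolating polynomial as the stand-in for "some model"; the helper name `lagrange_fit` is my own invention. The point is only that a flexible enough model hugs even meaningless data perfectly.

```python
import random


def lagrange_fit(xs, ys):
    """Return the polynomial of degree len(xs) - 1 that passes through every point.

    xs must be distinct. This is ordinary Lagrange interpolation.
    """
    def model(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    # Basis factor: equals 1 at x == xi and 0 at x == xj.
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return model


random.seed(0)
xs = [float(i) for i in range(8)]
ys = [random.gauss(0, 1) for _ in xs]  # pure noise: y has nothing to do with x

model = lagrange_fit(xs, ys)

# Largest gap between the "model" and the data it was fit to.
max_residual = max(abs(model(x) - y) for x, y in zip(xs, ys))
print("largest misfit on the fitted data:", max_residual)
```

The misfit is zero (up to floating point) even though there is nothing in the data to find. A perfect fit, by itself, tells you about the flexibility of the model, not about the truth of the hypothesis.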
And still this wouldn’t be wrong, except that after the fit is made, the statistician and researcher stop. They should not!
Consider physics, a field which has far fewer problems than medicine. Data and models abound in physics, too. But after the fit is made, the model is used to predict brand new data, data nobody has yet seen; data, therefore, that is not as subject to researcher control or bias. Physics advances because it makes testable, verifiable predictions.
Fields that make use of statistics rarely make predictions with their models. The fit is all. Since any data can fit some model, it is no surprise when any data does fit some model. That is why so many results that use statistical models as primary evidence later turn out to be wrong. The researchers were looking in the wrong direction: to the past, when they should have been looking to the future.
This isn’t noticed because the published results are first filtered through people who practice statistics in just the same way.
Though scientists and science journalists are constantly talking up the value of the peer-review process, researchers admit among themselves that biased, erroneous, and even blatantly fraudulent studies easily slip through it. Nature, the grande dame of science journals, stated in a 2006 editorial, “Scientists understand that peer review per se provides only a minimal assurance of quality, and that the public conception of peer review as a stamp of authentication is far from the truth.” What’s more, the peer-review process often pressures researchers to shy away from striking out in genuinely new directions, and instead to build on the findings of their colleagues (that is, their potential reviewers) in ways that only seem like breakthroughs…
Except, of course, for studies which examine the influence of climate change, or for other studies which are in politically favorable fields: stem cell research, AIDS research, drug trials by pharmaceuticals, “gaps” in various sociological demographics, and on and on. Those are all OK.
Incidentally, predictions can be made from statistical models, just like in physics. It’s just that nobody does it. Partly this is because it is expensive (twice as much data has to be collected), but mostly it’s because researchers wouldn’t like it. After all, they’d spend a lot of time showing that what they wanted to believe is wrong. And who wants to do that?
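For the curious, here is roughly what such a check might look like. This is a sketch under stated assumptions, not a prescription: the numbers are invented for illustration, the model is a plain least-squares line, and the names `fit_line`, `mean_abs_error`, and `prediction_check` are hypothetical helpers of my own. The idea is to fit on the data already in hand, then score the model only on data collected afterwards, and compare it with the dumbest possible guess (the old average).

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns a prediction function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def mean_abs_error(predict, xs, ys):
    """Average absolute gap between predictions and observed values."""
    return sum(abs(predict(x) - y) for x, y in zip(xs, ys)) / len(xs)


def prediction_check(fit, old, new):
    """Fit on old data only, then score on new data the model never saw."""
    old_x, old_y = old
    new_x, new_y = new
    model = fit(old_x, old_y)
    old_mean = sum(old_y) / len(old_y)  # naive guess: "nothing changes"
    return {
        "fit_error": mean_abs_error(model, old_x, old_y),
        "prediction_error": mean_abs_error(model, new_x, new_y),
        "baseline_error": mean_abs_error(lambda x: old_mean, new_x, new_y),
    }


# Invented numbers, for illustration only.
old = ([0.0, 1.0, 2.0, 3.0, 4.0], [1.1, 2.9, 5.2, 6.8, 9.0])       # data in hand
new = ([5.0, 6.0, 7.0, 8.0, 9.0], [11.1, 12.8, 15.1, 17.0, 18.9])  # collected later

report = prediction_check(fit_line, old, new)
for name, value in report.items():
    print(name, round(value, 3))
```

The number to publish is `prediction_error`, set against `baseline_error`, not `fit_error`. In this toy case the line predicts the new data far better than the naive guess; a cherished hypothesis that could not manage that would have been caught before the press release.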