Coin Flips and Dice Rolls
Doug Keenan has a must-read piece in the Wall Street Journal on the time series analysis of global temperature (Thanks to Randy, Roger, and John for the links).
I differ from Keenan on the value of “significance” (see the next story), but that difference makes no difference here. Significance or lack of it isn’t the real story: the model that represents our uncertainty in the temperature is.
Something caused the temperature to take the values it did. That something is the physics of heat and radiative transfer and so forth, processes so complicated that no climate model has come close to reproducing the observed values of temperature (climate models have produced simulations that mimic certain statistical properties of the observed values, which is not at all the same thing).
Since the physics is unworkable, climatologists turn to statistics, always a dangerous ploy. It is dangerous because there are an infinite number of statistical models that can reproduce the observed values of the time series. Most of these models will be useless, however, in predicting not-yet-observed values of the time series. Even worse, the decision of “statistical significance” changes with the model. Using one model, the data is “statistically significant”; using another, it is not. See Doug’s article for how this works for the global temperature series.
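To make the point about significance changing with the model concrete, here is a minimal sketch in Python. It is not Keenan's model, nor the IPCC's, and it uses simulated data rather than temperatures. The series are driftless random walks, so there is no trend to find. Each one is then tested two ways: with a straight-line trend model that assumes independent errors, and with a random-walk model that asks whether the average step differs from zero. The function names, series length, and 1.96 cutoff are all illustrative assumptions.

```python
import numpy as np


def trend_t_stat(y):
    """t-statistic for the slope of an OLS straight line, assuming independent errors."""
    n = len(y)
    t = np.arange(n, dtype=float)
    t_c = t - t.mean()
    y_c = y - y.mean()
    slope = (t_c @ y_c) / (t_c @ t_c)
    resid = y_c - slope * t_c
    se = np.sqrt((resid @ resid) / (n - 2) / (t_c @ t_c))
    return slope / se


def drift_t_stat(y):
    """t-statistic for the mean step, treating the series as a random walk."""
    steps = np.diff(y)
    return steps.mean() / (steps.std(ddof=1) / np.sqrt(len(steps)))


def rejection_rates(n_series=2000, n_points=130, seed=0):
    """Fraction of simulated trendless series each model calls 'significant'."""
    rng = np.random.default_rng(seed)
    crit = 1.96  # approximate two-sided 5% cutoff (an assumption of this sketch)
    trend_hits = 0
    drift_hits = 0
    for _ in range(n_series):
        y = np.cumsum(rng.standard_normal(n_points))  # driftless random walk
        trend_hits += int(abs(trend_t_stat(y)) > crit)
        drift_hits += int(abs(drift_t_stat(y)) > crit)
    return trend_hits / n_series, drift_hits / n_series


if __name__ == "__main__":
    trend_rate, drift_rate = rejection_rates()
    print(f"straight-line model says 'significant': {trend_rate:.0%} of series")
    print(f"random-walk model says 'significant':   {drift_rate:.0%} of series")
```

Same data, two models, two very different verdicts. Neither verdict says anything about which model would predict new values well.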
Want the best kept secret in statistics? You can always—as in every single time without exception—find a model which says your data is “statistically significant.” That model will almost certainly be worse than useless in predicting new data, though. If your model is correct, it ought to be able to predict new data, the only true test of model correctness.
But if people aren’t asking about predictions (and most don’t) and merely want to know whether their belief about what might have caused the data to take the values it did is supported, then you can always pronounce your belief “statistically significant,” because you can always find a model that produces “significance.”
It turns out that for temperature data one very simple model says “Significant!”, but other, more complex models say “Not significant.” Again, this is no surprise: this will always be true regardless of the data. But the model used by the IPCC is a very simple and not very realistic one. Keenan finds other models that fit better and that pass certain rule-of-thumb tests time series modelers use.
Keenan isn’t claiming his model is certainly right and the IPCC’s is certainly wrong. But it is clear that the IPCC was not especially vigorous in investigating plausible alternatives. And why should they? They had their “significance”, which is all they really wanted.
Supreme Court Statistics
Read Carl Bialik’s Wall Street Journal piece “Making a Stat Less Significant”, about how the question of statistical significance has reached the Supreme Court of these fine United States.
In a case before the court the justices said
companies can’t only rely on statistical significance when deciding what they need to disclose to investors.
Amen, say several statisticians who have long argued that the concept of statistical significance has unjustly overtaken other barometers used to determine which experimental results are valid and warrant public distribution. “Statistical significance doesn’t tell you everything about the truth of the hypothesis you’re exploring,” says Steven Goodman, an epidemiologist and biostatistician at the Johns Hopkins Bloomberg School of Public Health.
A point on which most statisticians agree is that statistical significance is difficult to explain.
I’ll echo that Amen, brother Goodman. Statistical significance does not indicate the truth or falsity of the hypothesis you’re exploring. Instead, it tells you the probability of seeing a value of an ad hoc statistic larger than the one you actually saw, given that you make some rather curious assumptions about the probability model used. And, most crucially, given that the model used is true (see the story above).
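A small sketch of that tail probability, using an invented example rather than anything from the court case: suppose you saw 60 heads in 100 coin flips. The statistic is the count of heads, and the function below (a hypothetical helper, not a library routine) gives the probability of a count at least that large under whatever model of the coin you assume.

```python
from math import comb


def tail_probability(heads, flips, p_heads):
    """P(count of heads >= `heads`) if each flip lands heads with probability p_heads."""
    return sum(
        comb(flips, k) * p_heads**k * (1 - p_heads) ** (flips - k)
        for k in range(heads, flips + 1)
    )


if __name__ == "__main__":
    # 60 heads in 100 flips is made-up data for illustration only.
    for p in (0.5, 0.55):
        prob = tail_probability(60, 100, p)
        print(f"assumed P(heads) = {p}: tail probability = {prob:.4f}")
```

The number depends entirely on the assumed model: change the assumed chance of heads and the same 60 heads yields a different probability. Nothing in the calculation says whether the assumed model is true.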
The question here was, “Does this drug produce harmful side effects?” Bayesian statistics can answer that question, or at least put a probability value on it, but frequentist statistics (which uses statistical significance as a measure) must remain mute.
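Here is a minimal sketch of what putting a probability value on it could look like. Everything in it is an assumption for illustration: the counts are invented, the side-effect rates get flat Beta(1, 1) priors, and each patient is treated as an independent yes/no outcome. Under those assumptions the posterior for each rate is a Beta distribution, and the function (a hypothetical name) estimates by simulation the probability that the rate on the drug exceeds the rate on placebo.

```python
import numpy as np


def prob_drug_rate_higher(events_drug, n_drug, events_placebo, n_placebo,
                          n_draws=200_000, seed=0):
    """Posterior P(side-effect rate on drug > rate on placebo), by simulation.

    Assumes independent binomial outcomes and uniform Beta(1, 1) priors,
    so each posterior is Beta(1 + events, 1 + non-events).
    """
    rng = np.random.default_rng(seed)
    drug = rng.beta(1 + events_drug, 1 + n_drug - events_drug, n_draws)
    placebo = rng.beta(1 + events_placebo, 1 + n_placebo - events_placebo, n_draws)
    return float(np.mean(drug > placebo))


if __name__ == "__main__":
    # Invented counts: 12 side effects in 200 on the drug, 5 in 200 on placebo.
    p = prob_drug_rate_higher(12, 200, 5, 200)
    print(f"P(drug rate > placebo rate | data, model) = {p:.3f}")
```

The result is a direct statement about the question asked, though it is still conditional on the model and priors chosen, and a different model would give a different number.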
The story of how a philosophy of evidence that can’t answer direct questions came to dominate science is fascinating. Regular readers will already know the answer, but newcomers will want to click on the Stats/Climate link at the top of the page and read the relevant articles.
Thanks to Bernie and several other readers who provided this link.