Climate Science And Significance: Wall Street Journal Takes On Statistics

Coin Flips and Dice Rolls

Doug Keenan has a must-read piece in the Wall Street Journal on the time series analysis of global temperature (Thanks to Randy, Roger, and John for the links).

I differ from Keenan on the value of “significance” (see the next story), but that difference makes no difference here. Significance or lack of it isn’t the real story: the model that represents our uncertainty in the temperature is.

Something caused the temperature to take the values it did. That something is the physics of heat and radiative transfer and so forth, processes so complicated that no climate model has come close to reproducing the observed values of temperature (climate models have produced simulations that mimic certain statistical properties of the observed values, which is not at all the same thing).

Since the physics is unworkable, climatologists turn to statistics, which is always a dangerous ploy. This is because there are an infinite number of statistical models that can reproduce the observed values of the time series. Most of these models will be useless, however, in predicting not-yet-observed values of the time series. Even worse, the decision of “statistical significance” changes with the model. Using one model, the data is “statistically significant”; using another, it is not. See Doug’s article for how this works for the global temperature series.
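Here is a rough sketch of how the verdict can flip with the model. Everything in it is an assumption made for illustration: the data are simulated driftless random walks (not temperatures), the two tests are textbook choices, and neither is claimed to be the IPCC's model or Keenan's. The function names are made up for this sketch.

```python
import numpy as np

def trend_t_stat(y):
    """t-statistic for the slope of a straight-line fit, assuming independent errors."""
    n = len(y)
    t = np.arange(n, dtype=float)
    tc = t - t.mean()
    yc = y - y.mean()
    slope = (tc @ yc) / (tc @ tc)
    resid = yc - slope * tc
    se = np.sqrt((resid @ resid) / (n - 2) / (tc @ tc))
    return slope / se

def drift_t_stat(y):
    """t-statistic for a non-zero average step, treating y as a random walk with drift."""
    d = np.diff(y)
    return d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))

def rejection_rates(n_series=500, length=130, seed=0):
    """Fraction of trendless random walks each model calls 'significant' (|t| > 2)."""
    rng = np.random.default_rng(seed)
    naive = walk = 0
    for _ in range(n_series):
        y = np.cumsum(rng.normal(size=length))  # no trend built in
        naive += int(abs(trend_t_stat(y)) > 2.0)
        walk += int(abs(drift_t_stat(y)) > 2.0)
    return naive / n_series, walk / n_series

naive_rate, walk_rate = rejection_rates()
print(f"straight-line model says 'significant': {naive_rate:.0%} of series")
print(f"random-walk model says 'significant':   {walk_rate:.0%} of series")
```

Same data, two models, two very different rates of "significance". Which model is closer to the truth is not something the significance test itself can tell you.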

Want the best kept secret in statistics? You can always—as in every single time without exception—find a model which says your data is “statistically significant.” That model will almost certainly be worse than useless in predicting new data, though. If your model is correct, it ought to be able to predict new data, the only true test of model correctness.
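One crude way to see this for yourself, sketched under stated assumptions: the "data" below is pure simulated noise, and the "model search" is nothing more than trying 200 equally meaningless predictors and keeping the best one. This is only one of many ways to hunt for significance, and the helper name corr_t is made up for the sketch.

```python
import numpy as np

def corr_t(x, y):
    """t-statistic for the correlation between x and y."""
    r = np.corrcoef(x, y)[0, 1]
    return r * np.sqrt((len(y) - 2) / (1 - r**2))

rng = np.random.default_rng(1)
n = 50
y = rng.normal(size=n)                   # pure noise: nothing to explain
candidates = rng.normal(size=(200, n))   # 200 equally meaningless predictors

# Search for the predictor that makes the noise look most "significant".
t_stats = np.array([corr_t(x, y) for x in candidates])
best = int(np.argmax(np.abs(t_stats)))
best_t = float(t_stats[best])

# The only real test: does the winning predictor say anything about new data?
y_new = rng.normal(size=n)
new_t = float(corr_t(candidates[best], y_new))

print(f"best in-sample t-statistic: {best_t:.2f}")
print(f"same predictor, new data:   {new_t:.2f}")
```

The in-sample winner clears the usual |t| > 2 bar by construction of the search. Nothing connects the predictor to the new data, so there is no reason to expect the second number to hold up.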

But if people aren’t asking about predictions (and most aren’t) and merely want to know whether their belief about what might have caused the data to take the values it did is supported, then you can always pronounce your belief “statistically significant,” because you can always find a model that produces “significance.”

It turns out for temperature data one very simple model says “Significant!”, but other, more complex models, say “Not significant.” Again, this is no surprise: this will always be true regardless of the data. But the model used by the IPCC is a very simple and not very realistic one. Keenan finds other models that fit better and that pass certain rule-of-thumb tests time series modelers use.

Keenan isn’t claiming his model is certainly right and the IPCC’s is certainly wrong. But it is clear that the IPCC was not especially vigorous in investigating plausible alternatives. And why should they? They had their “significance”, which is all they really wanted.

Supreme Court Statistics

Read Carl Bialik’s Wall Street Journal piece “Making a Stat Less Significant”, about how the question of statistical significance has reached the Supreme Court of these fine United States.

In a case before the court the justices said

companies can’t only rely on statistical significance when deciding what they need to disclose to investors.

Amen, say several statisticians who have long argued that the concept of statistical significance has unjustly overtaken other barometers used to determine which experimental results are valid and warrant public distribution. “Statistical significance doesn’t tell you everything about the truth of the hypothesis you’re exploring,” says Steven Goodman, an epidemiologist and biostatistician at the Johns Hopkins Bloomberg School of Public Health.

A point on which most statisticians agree is that statistical significance is difficult to explain.

I’ll echo that Amen, brother Goodman. Statistical significance does not indicate the truth or falsity of the hypothesis you’re exploring. Instead, it tells you the probability of seeing a value of an ad hoc statistic larger than the one you actually saw, given you make some rather curious assumptions about the probability model used. And, most crucially, given the model used is true (see the story above).
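To make that definition concrete, here is a minimal sketch with a coin, in keeping with the first story's title. The numbers (8 heads in 10 tosses) and the function name are invented for illustration. The statistic is the number of heads, and the calculation counts results at least as large as the one seen.

```python
from math import comb

def tail_probability(heads, tosses, p_heads):
    """P(at least `heads` heads in `tosses` tosses), given the assumed chance of heads."""
    return sum(
        comb(tosses, k) * p_heads**k * (1 - p_heads) ** (tosses - k)
        for k in range(heads, tosses + 1)
    )

# Assume the model "the coin is fair" is true.
p_fair = tail_probability(8, 10, 0.5)    # 56/1024, about 0.0547

# Same data, but assume a different model is true: chance of heads is 0.6.
p_other = tail_probability(8, 10, 0.6)

print(f"assuming a fair coin:          {p_fair:.4f}")
print(f"assuming chance of heads 0.6:  {p_other:.4f}")
```

Both numbers are statements about the statistic given an assumed model. Neither is the probability that the coin is fair.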

The question here was, “Does this drug produce harmful side effects?” Bayesian statistics can answer that question, or at least put a probability value on it, but frequentist statistics (which uses statistical significance as a measure) must remain mute.
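As a sketch of what "put a probability value on it" might look like, take some entirely made-up trial counts (these are not from the court case) and a simple Beta-Binomial model with flat priors. The model, the priors, and the counts are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical counts, invented for this example only.
drug_events, drug_n = 12, 100        # side effects among patients on the drug
placebo_events, placebo_n = 5, 100   # side effects among patients on placebo

# With a flat Beta(1, 1) prior, the posterior for each rate is again a Beta.
draws = 200_000
drug_rate = rng.beta(1 + drug_events, 1 + drug_n - drug_events, size=draws)
placebo_rate = rng.beta(1 + placebo_events, 1 + placebo_n - placebo_events, size=draws)

# Probability, given the data and this model, that the drug's rate is higher.
prob_drug_worse = float(np.mean(drug_rate > placebo_rate))
print(f"Pr(drug rate > placebo rate | data, model) = {prob_drug_worse:.2f}")
```

The answer is still conditional on the model being a reasonable one, but it is at least a direct answer to the question that was asked.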

The story of how a philosophy of evidence that can’t answer direct questions came to dominate science is fascinating. Regular readers will already know the answer, but newcomers will want to click on the Stats/Climate link at the top of the page and read the relevant articles.

Thanks to Bernie and several other readers who provided this link.


  1. Speed

    For those without a Wall Street Journal subscription, do a Google search on the title (How Scientific Is Climate Science?). Google will provide a link to a free and complete version of the article.

  2. stan


    Have you seen the video of Dr. Courtillot? In it he discusses the step changes in temps (e.g. 1998) and argues that it is a mistake to try to fit trends when the data is better understood as basically flat between instances of step changes.

    Do you have any interest in fleshing out the differences and importance b/w his view of step changes and the warmist trend analysis?

  3. JH

    Now we see that the similarity between the two lines is strong: one excellent piece of evidence that ice ages are indeed caused by orbital variations.

    Dear Mr. Doug Magowan,

    I hope you’ll read this comment. I have found someone who will appreciate your conclusion that “there was an uncanny relationship between the Giants place in the chase for the NL West Division and the performance of the Dow Jones industrial average” because of the similarity between the two lines.

  4. JH

    Now we see that the similarity between the two lines is strong: one excellent piece of evidence that ice ages are indeed caused by orbital variations.

  5. Briggs


    The difference of course being that in the one case there was no known causal relationship and in the second case there is.

  6. JH

    Well, “similarity between the two lines” is the reason and evidence provided!

    I can’t be sure whether diagnostics and model checking have been done in the IPCC reports. Just as I can’t be sure about Mr. Keenan’s claim that “the IPCC’s conclusions about the significance of the temperature changes are unfounded.”

    You can always—as in every single time without exception—find a model which says your data is “statistically significant.”

    I know time series analysis well, but, how about teaching me how to do the above?

  7. Noblesse Oblige

    Stan 10:27 AM: This is the Courtillot video: There is also some very good work by the Wisconsin mathematics group (Tsonis et al and subsequent work). They model a set of decadal and multidecadal ocean cycles as coupled anharmonic oscillators and show that they undergo chaotic shifts in their synchronization corresponding to observed climate shifts c. 1910, 1940, 1970 and perhaps 2000.

    The message is that fitting trends based on linear models over decades is not the way to go. And of course this is not captured by the models.

  8. commieBob

    Professor Briggs

    I have a statistics question:

    I’m building an instrument (a spectrometer) which produces two values: wavelength and intensity. The wavelength errors don’t matter much. The intensity errors are a bit of a mystery to me and are reasonably large (1% to 5%). For random noise, experience says that averaging the signal for long enough will reduce the error to almost zero. That doesn’t seem to be the case here.

    So, the question is: assuming that the error isn’t due to random noise, what is the best/proper way to express the error? In electronics, we express almost everything in terms of RMS. 😉 (Some of the time I suspect that we do it out of habit rather than out of a deep understanding of the underlying processes.) My naive reading of the applicable wiki pages leads me to believe that I should be using mean absolute error. If you can answer my question or point me to an easy-to-understand-but-accurate reference I would be most grateful.

  9. a jones

    Yes I read this and whilst much is good some is misleading. For example the author suggests that if you toss a coin and it comes up ten heads in a row you might suspect it was not truly random.

    Of course you might, but of itself it is no reason to suppose that the coin and toss is not true. Like a roulette wheel or a die, a coin is merely a random number generator: it has NO MEMORY. And to paraphrase: given all the coins tossed in all the world at any time, how uncommon would a run of ten heads be?

    Where the author is of course right is that if we assume there is memory, the system is not random. At first sight this seems reasonable: if there is a hot day, much heat is stored, so the next day may well be hotter than it would otherwise be.

    The danger with such assumptions, as the author points out, is that they are arbitrary, in this case one year.

    Frankly I cannot even bother to go further, Mr. Briggs. You know as well as I, and indeed as you point out from time to time, the difficulties, the misunderstandings and the futility. Likewise.

    Kindest Regards.

  10. DAV

    a jones,

    BIAS does not imply MEMORY. As for whether the occurrence of a once-in-a-thousand event is sufficient reason to suspect bias, YMMV. When playing with dice, etc. it’s often prudent to NOT extend the benefit of the doubt. How long would it take for you to begin to suspect bias?


    You might want to look into the Cauchy distribution and its applications in spectroscopy. Since you’re building the sensor I assume you know all about aliasing and the like.

  11. heystoopid

    Very poor analogy and one that truly fails to capture reality. F minus!
