Update: This post is mandatory reading for those discussing global average temperature.
I mean it: exceedingly brief and given only with respect to a univariate time series, such as operationally defined global average temperature (GAT). Let him that readeth understand.
GAT by year are observed (here, assumed without error). If we want to know the probability that the GAT in 1980 was lower than in 2010, then all we have to do is look. If the GAT in 1980 was less than the GAT in 2010, then the probability that the GAT in 1980 was lower than in 2010 is 1, or 100%. If you do not believe this, you are a frequentist.
Similarly, if you ask what is the probability that the GAT in the 2000s (2001-2010) was higher than in the 1940s (1941-1950), then all you have to do is (1) give an operational definition of higher, and (2) just look. One such operational definition is that the number of warmer years in one decade outnumbers the number of warmer years in the other. If the number of warmer years in the 2000s outnumbers the tally of warmer years in the 1940s, then the probability that the 2000s were warmer than the 1940s is 1, or 100%.
There is no model needed to answer these or similar simple questions.
If you want to ask what is the probability that the GAT increased by at least X degrees C per year from 1900 to 2000, then all you have to do is look. If the GAT increased by at least X degrees C per year from 1900 to 2000, then the probability that the GAT increased by at least X degrees C per year from 1900 to 2000 is 1, or 100%. There is no need, none whatsoever, to ask whether the observed increase of at least X degrees C per year was “statistically significant.” The term is without meaning and devoid of interest.
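The "just look" questions above can be sketched in a few lines. This is only an illustration: the numbers in `gat` are made up and are not real measurements, the function names are hypothetical, and "increased by at least X degrees C per year" is read here as the average change between the two end years, which is one possible operational definition among several.

```python
# Made-up numbers for illustration only; these are NOT real GAT measurements.
gat = {1900: 13.7, 1980: 14.0, 2000: 14.3, 2010: 14.5}


def prob_lower(series, year_a, year_b):
    """Probability that the observed value in year_a was lower than in year_b.

    Both values are observed (assumed without error), so the answer is 0 or 1.
    """
    return 1.0 if series[year_a] < series[year_b] else 0.0


def prob_avg_increase_at_least(series, start, end, x):
    """Probability that the series rose by at least x per year from start to end.

    Assumption: "per year" means the average change between the end points.
    """
    rate = (series[end] - series[start]) / (end - start)
    return 1.0 if rate >= x else 0.0


print(prob_lower(gat, 1980, 2010))
print(prob_avg_increase_at_least(gat, 1900, 2000, 0.005))
```

No model, no parameters, and no test of "significance" appears anywhere: the functions only compare observed numbers.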
At this writing, the year is 2011, but the year is incomplete. I have observed GATs from at least 1900 until 2010. I want to know the probability that the GAT in 2011 (when complete) will be larger than the GAT (as measured) in 2010. I cannot observe this now, but I can still compute the probability. Here is how.
I must propose a model which relates the GAT to time. The model can be fixed, meaning it assumes that the GAT increases X degrees C a year: by which it means, it does not increase by X – 0.1, nor by X + 0.3, nor by any other number besides X. In my model, in 2011 the predicted GAT will be the GAT as it was in 2010 plus X. Conditional on this model—and on nothing else—the probability that the GAT in 2011 is larger than the GAT in 2010 is 1, or 100%. This is not necessarily the same probability that the eventually observed GAT in 2011 is larger than the GAT in 2010.
It is easy to see how I might adjust this fixed model by assigning the possible increase to be one of several values, each with a fixed (in advance) probability of occurring. I might also eschew fixing these increases and instead assume a parametric form for the possible increases. The most commonly used parametric form is a straight line (which has at least three parameters; there are different kinds of straight lines used in time series modeling). How do I know which kind of parametric model to use? I do not: I guess. Or I use the model that others have used because conformity is both pleasing and easy.
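The first adjustment described above, a handful of possible increases each with a probability fixed in advance, might look like the sketch below. The increments and their probabilities are invented for illustration, and the function name is hypothetical.

```python
import math

# Hypothetical increments (degrees C per year) mapped to probabilities that
# were fixed in advance. These values are invented for illustration.
increments = {-0.1: 0.2, 0.0: 0.3, 0.1: 0.5}


def prob_next_year_larger(steps):
    """Probability, conditional on this model only, that next year's value
    exceeds this year's: the total probability of the positive increments."""
    if not math.isclose(sum(steps.values()), 1.0):
        raise ValueError("increment probabilities must sum to 1")
    return sum(p for step, p in steps.items() if step > 0)


print(prob_next_year_larger(increments))
```

As with the fixed model, this probability is conditional on the model and on nothing else.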
I choose the straight line which has, among its parameters, one indicating the central tendency of a probability distribution that is related to, but is not itself, the increase in GAT through time. To call this parameter the “trend” can only cause grief and misunderstanding. This parameter is not, and cannot be, identical with the observed GAT.
Bayesian statistics allows me to say what values this parameter (and all the other parameters) is likely to take. It will allow me to say that, if this model is true and given the past years’ GATs, then the probability the parameter is greater than 0 is y, or Y%. This is the parameter posterior distribution. Suppose that y = 0.9 (Y = 90%). Can I then answer the question what is the probability that the GAT in 2011 is larger than the GAT in 2010? NO. This is the only probability that means anything to me, but I cannot yet answer it. What if y = 0.999999, or however many 9s you like: can I then say what is the probability the GAT in 2011 is larger than the GAT in 2010? No, no, and no, with just as many “no”s as 9s. Again, “statistical significance” of some parameter (mistakenly called “trend”) is meaningless.
However, Bayesian statistics allows me to take the parameterized model and to weight it by each possible value of the parameters. The end result is a prediction of the possible values of the GAT in 2011, complete with a probability that each of these possible values is the true one, assuming the model is true. This is the posterior predictive distribution; it is free of all parameters and only speaks in terms of observables, here year and GAT.
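In symbols, the weighting just described can be written as follows. The notation is an assumption introduced here, not taken from any particular textbook: $g_{2011}$ is the not-yet-observed GAT, $g_{1900:2010}$ the observed series, $\theta$ all the parameters of the model, and $M$ the model itself.

```latex
p(g_{2011} \mid g_{1900:2010}, M)
  = \int p(g_{2011} \mid \theta, M)\; p(\theta \mid g_{1900:2010}, M)\; d\theta
```

The parameters are integrated out, so the left-hand side mentions only observables and the model.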
I can use the posterior predictive distribution and directly ask what is the probability that the GAT in 2011 is larger than the GAT in 2010. This probability assumes the model is true (and assumes the previous values of GAT are measured without error).
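A minimal sketch of the difference between the parameter posterior and the posterior predictive distribution follows. Everything in it is an assumption made for illustration: the series is synthetic (generated from a seeded random number generator, not real GAT data), the straight line has independent normal errors, and the prior is the standard flat prior proportional to 1/sigma^2. Other straight-line models exist, and this is not presented as the right one.

```python
import numpy as np


def posterior_draws(years, values, n_draws=20000, seed=0):
    """Draw (intercept, slope) and sigma^2 from the posterior of a normal
    straight-line model under the flat prior p(beta, sigma^2) ~ 1/sigma^2."""
    rng = np.random.default_rng(seed)
    center = years.mean()
    t = years - center                       # centered time, float array
    X = np.column_stack([np.ones_like(t), t])
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ values
    resid = values - X @ beta_hat
    s2 = resid @ resid / (n - k)
    # sigma^2 | data is scaled inverse chi-square with n - k degrees of freedom
    sigma2 = (n - k) * s2 / rng.chisquare(n - k, size=n_draws)
    # beta | sigma^2, data is normal(beta_hat, sigma^2 * (X'X)^-1)
    L = np.linalg.cholesky(XtX_inv)
    z = rng.standard_normal((n_draws, k))
    beta = beta_hat + np.sqrt(sigma2)[:, None] * (z @ L.T)
    return beta, sigma2, center, rng


def predictive_draws(beta, sigma2, center, new_year, rng):
    """Draw the new observable itself: line value plus a fresh error term."""
    mean = beta[:, 0] + beta[:, 1] * (new_year - center)
    return mean + np.sqrt(sigma2) * rng.standard_normal(len(sigma2))


# Synthetic series for illustration only; NOT real GAT measurements.
data_rng = np.random.default_rng(42)
years = np.arange(1900, 2011)
gat = 14.0 + 0.005 * (years - 1900) + data_rng.normal(0.0, 0.15, size=years.size)

beta, sigma2, center, rng = posterior_draws(years, gat)
p_param = np.mean(beta[:, 1] > 0)            # Pr(slope parameter > 0 | data, model)
g2011 = predictive_draws(beta, sigma2, center, 2011, rng)
p_pred = np.mean(g2011 > gat[-1])            # Pr(GAT 2011 > GAT 2010 | data, model)

print("Pr(parameter > 0):          ", p_param)
print("Pr(GAT 2011 > GAT 2010 obs):", p_pred)
```

The two printed numbers answer different questions. The first is a statement about an unobservable parameter; the second is a statement about the observable GAT in 2011, and it is the only one of the two that can later be checked against what happens. Both are conditional on the assumed model being true.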
If I have more than one model, then I will have more than one probability that the GAT in 2011 is larger than the GAT in 2010. Each probability assumes that the model that generated it is true. Which model is really true? I can only judge by external evidence. This evidence (or these premises) tells me the probability each model is true. I can then use these probabilities, and the probabilities that the GAT in 2011 is larger than the GAT in 2010, to produce a final probability that the GAT in 2011 is larger than the GAT in 2010. This probability is not conditional on the truth of any one of the models.
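The combination step is a weighted sum: each model's probability of the event, weighted by the probability that the model is true. A small sketch, with invented numbers and a hypothetical function name:

```python
import math


def model_averaged_probability(model_probs, event_probs):
    """Sum over models of Pr(model is true) * Pr(event | model).

    model_probs: probabilities that each model is true, from external evidence.
    event_probs: each model's probability of the event (e.g. GAT 2011 > GAT 2010).
    """
    if len(model_probs) != len(event_probs):
        raise ValueError("need one event probability per model")
    if not math.isclose(sum(model_probs), 1.0):
        raise ValueError("model probabilities must sum to 1")
    return sum(pm * pe for pm, pe in zip(model_probs, event_probs))


# Invented numbers for three hypothetical models, for illustration only.
print(model_averaged_probability([0.5, 0.3, 0.2], [0.6, 0.9, 0.4]))
```

Requiring the model probabilities to sum to 1 is where the premise that one of the models in the set is true enters the calculation.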
But it still is conditional on the premise that at least one of the models in our set is true. If none of these models in our set is true—which we could only know using external evidence—then the probability that the GAT in 2011 is larger than the GAT in 2010 is likely to be wrong (it still may be right by coincidence).
I hope you can see that I can ask any question about the observables prior to 2011 and that in 2011. For example, I can ask what is the probability that the GAT in 2011 is Z degrees C higher than in 2010. Or I can ask, what is the probability that the GAT in 2011 is W degrees C higher than the average of the years 2001-2010. And so on.
This is how Richard Muller’s group should issue their statements on the GAT.