Here is an atmospheric monthly average temperature series: T = (61, 69, 69, 70, 72, 65, 63) (all F). What caused the temperature to take the value T1 = 61?
Well, the sun, the amount of moisture in the air, especially in the form of clouds, the characteristics of the land around the measurement device, and things of that nature. Not just one cause, but many contributing causes. The temperature, after all, is a bunch of air molecules (you know what I mean) jostling around the surface of a thermometer. And every air molecule didn’t get pushed in precisely the same way.
And since these are monthly averages of daily averages, which themselves are averages of hourly measurements, it is difficult to identify clearly all the causes that went into that 61. Regardless of our ignorance, it has some cause or causes.
So what caused the temperature to take the value T2 = 69? Was it T1?
Before answering, consider that you took two air temperature measurements one second apart, with values, say, 68 and 68. Did the first 68 cause the second 68? How much can change in a second? Air can “entrain”, a fancy way of saying the wind can blow new air onto your thermometer. Solar radiation, mediated by clouds, can change that quickly, but maybe not enough to cause the air to change by more than 1 degree.
But since it is tiny air molecules that make temperature, and several things can and do change those molecules, which are moving at enormous speeds, the first 68 did not cause the second 68. Instead, the air molecules and the forces acting on them are the cause of both temperatures. It then becomes clear that in our monthly series T2 was not caused by T1.
What if I were to suggest that the series was caused by some mechanism or mechanisms that actually first made each monthly temperature 67 and then pushed it upwards or downwards from there? Thus the “real” series is T = 67 + (-6, 2, 2, 3, 5, -2, -4). This could happen. It might be that some physical force loves the number 67 so much that it returns all values there before moving them on to other pastures. Nothing we know of physics gives any support for this curious view, however; indeed, it is rather silly.
If the values of T are similar, it’s much more likely that the causes of the temperatures haven’t themselves varied much and are operating along more-and-less understood pathways (such as a bent earth swirling around the sun once each anno Domini).
Our series is only 7 members long. What if we wanted to guess T8? On the theory that the series starts low, comes up a bit, then descends, we might say, given only this information, the chances that T8 is 62 or lower are fair. And that’s about as good as we can do.
Notice very, very carefully that the information we’re conditioning our guess on tacitly includes facts about seasonality and such like. When we say the chance is “fair”, we’re not saying anything directly about the cause; instead, we’re giving a measure of our uncertainty of T8 based on effects possibly and hopefully related to its causes and not its causes themselves. If we knew the cause(s), we would know what T8 would be and we wouldn’t need our crude statistical model.
Since our word-model—and it is a statistical model—is crude, it’s unlikely to survive peer review. Why not something better? How about fitting a sine wave to account for seasonality, which we suspect on physical grounds is associated with the cause of changing solar irradiance? Slicker yet, a generalized autoregressive conditional heteroscedastic time-series beauty, complete with a built-in trend parameter? Or, if we want to be cute, we can go with a machine-learning or neural-net algorithm which “learns” from the data.
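As a sketch of the sine-wave option, and only a sketch, here is an ordinary least-squares fit of an annual cycle to our seven observations, extrapolated to month 8. The model form, the 12-month period, and the extrapolation are all assumptions of mine, not anything the data or the physics hand us:

```python
import numpy as np

# The seven monthly averages from the series above.
T = np.array([61, 69, 69, 70, 72, 65, 63], dtype=float)
months = np.arange(1, 8)

# Write a + b*sin(2*pi*m/12 + phase) as a + b*sin(w) + c*cos(w),
# which is linear in (a, b, c) and so fits by ordinary least squares.
w = 2 * np.pi * months / 12
X = np.column_stack([np.ones_like(w), np.sin(w), np.cos(w)])
coef, *_ = np.linalg.lstsq(X, T, rcond=None)

# Extrapolate to month 8: a point guess from M, not a cause of T8.
w8 = 2 * np.pi * 8 / 12
T8_hat = float(coef @ np.array([1.0, np.sin(w8), np.cos(w8)]))
print(round(T8_hat, 1))
```

The fit will dutifully produce a number for T8 whether or not the annual-cycle assumption is any good; that is rather the point.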
Doesn’t matter. Pick any model you like, and call it M. Unless M incorporates every cause of T, it will be probabilistic and not physical, even though it may contain portions of causes; say, radiative transfer at the molecular level, or fluid flow dynamics; even our word-model was partly physical. M is thus no different in essence from our crude word-model, except that M will likely be able to better quantify the probability T8 takes some value.
We know from observation that T6 = 65 and T7 = 63. Given these observations, what is the probability of two months equal to or under 65? Well, it’s 1, or 100%, because, I feel I must emphasize, we saw two months equal to or under 65.
Now that probability is separate and entirely different from the answer to this separate, entirely different question: given M, what is the probability that T8 and T9 are less than or equal to 65? And that is a separate, entirely different question from this one: given M, what is the probability that two months out of the next N are less than or equal to 65?
If we want to know what happened, we just look. (Repeat that thrice.) If we want to know what’s going to happen, or, at least, if we want to quantify our uncertainty in what will happen, we need M. Conversely, we do not need M to give probabilities of anything that already occurred. That would be silly.
Probabilities can’t cause anything
Whatever question we have about the future, it should be blazingly obvious that the choice of M dictates these probabilities: different M’s will (probably) give different probabilities. M is not a cause of T, nor do its probabilities cause T, and nor are past values of T causes of future values. M is an encapsulation of our limited knowledge and nothing else.
Now suppose my M said that the probability of two months in a row less than or equal to 65 was very small; make it as small as you like as long as it isn’t 0. Of course, we saw two months in a row meeting this criterion. We’d thus be right to suspect that M wasn’t any great shakes as a model since it gave such a low probability to an event we saw. On the other hand, this is far from proof against M. It could be that this kind of event really is rare. It would, incidentally, be the grossest mistake to say these events cannot happen because, of course, we saw that they can.
It could be that our ignorance of the cause(s) of T is so great that any reasonable M we can think of would say the probability of two low months is small. But then, because we have a big tool bag, it usually isn’t difficult to find some model which would say streaks like this are common. So, given our observations, which M would you prefer? One that said the probability of streaks was small or another that said it was not?
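To make the choice concrete, here is a toy comparison, in no way one of the models discussed below: two M’s, an i.i.d. Gaussian and a persistent AR(1), asked the same question about the probability of at least one pair of consecutive months at or below 65 in the next twelve. The Gaussian form, the AR(1) structure, and the autocorrelation of 0.8 are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
T = np.array([61, 69, 69, 70, 72, 65, 63], dtype=float)
mu, sigma = T.mean(), T.std(ddof=1)  # about 67 and 4

def has_cold_pair(series, thresh=65.0):
    """True if any two consecutive values are both <= thresh."""
    return any(series[i] <= thresh and series[i + 1] <= thresh
               for i in range(len(series) - 1))

def prob_cold_pair(phi, n_months=12, n_sims=5000):
    """Estimate P(at least one consecutive pair <= 65) under an AR(1)
    model with autocorrelation phi; phi = 0 reduces to i.i.d. months."""
    innov_sd = sigma * np.sqrt(1 - phi ** 2)  # keeps marginal sd = sigma
    hits = 0
    for _ in range(n_sims):
        x, series = mu, []
        for _ in range(n_months):
            x = mu + phi * (x - mu) + rng.normal(0.0, innov_sd)
            series.append(x)
        hits += has_cold_pair(series)
    return hits / n_sims

p_iid = prob_cold_pair(phi=0.0)  # independent months
p_ar = prob_cold_pair(phi=0.8)   # strongly persistent months
print(p_iid, p_ar)
```

Different M’s, different probabilities; and nothing in the observed seven months can, by itself, tell you which of the two to trust.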
Here’s the bad news. It’s impossible to know which non-causal model is best given just the observed data, especially considering the non-causal models were derived from the observed data. The only real way to know non-causal model quality is to use each M to predict never-before-seen data, and then to compare the predictions with what happened. Statements of how well your M “fit” the observed data are not interesting, especially considering that it is always possible (I won’t prove this here) to find an M which fits any observed data perfectly.
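A minimal sketch of that only honest check, granting that seven points are far too few for it to mean much: hold out the last two observations, “predict” them with two crude stand-in models (both my inventions), and score against what actually happened:

```python
import numpy as np

T = np.array([61, 69, 69, 70, 72, 65, 63], dtype=float)
train, test = T[:5], T[5:]  # pretend the last two months are unseen

# Model A predicts the training mean; Model B predicts the last
# training value (persistence). Both are crude stand-ins for M.
pred_mean = np.full(len(test), train.mean())  # 68.2 for both months
pred_persist = np.full(len(test), train[-1])  # 72 for both months

def mae(pred):
    """Mean absolute error against the held-out observations."""
    return float(np.abs(pred - test).mean())

print(round(mae(pred_mean), 1), round(mae(pred_persist), 1))  # 4.2 8.0
```

Only comparisons like this, made on data the models never saw, say anything about model quality; how well either model “fit” the first five points says nothing.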
What are the Chances?
And now a practicum in the form of a peer-reviewed paper, “Warm Streaks in the US Temperature Record: What are the Chances?” by Peter Craigmile and a few others in JGR: Atmospheres. The authors examined NOAA’s National Climatic Data Center’s monthly contiguous US average monthly temperatures and discovered that temperatures for sixteen consecutive months were above some (arbitrary) threshold (this data’s upper tercile).
They wanted to know the probability this could happen. As above, given that we have seen it, the probability is 1. But Craigmile and friends, conditioning on a couple of M’s, say it is low, and, conditioning on a couple of other M’s, say it is not low. Actually, because of mathematical considerations, they calculate the probability of streaks of 16 or more months in some hypothetical future of arbitrary length of time.
They use fancy M—only the best—and estimate the probabilities by simulation, which surely sounds impressive. But these must, as we learned above, be the probabilities of events that have not yet happened. They cannot be of events that have already happened, because the probability of those events is 1. Here’s what they say:
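The flavor of such a simulation can be had from a deliberately naive sketch, which is not the authors’ method. Everything here is an assumption of mine: months are independent, each lands in the top tercile with probability 1/3 (or 0.6, as a crude stand-in for a warming shift), and the record is 1,200 months long:

```python
import numpy as np

rng = np.random.default_rng(0)

def prob_streak(p_hit, run_len=16, n_months=1200, n_sims=3000):
    """Estimate P(at least one run of run_len consecutive top-tercile
    months in n_months), each month an independent "hit" with
    probability p_hit."""
    window = np.ones(run_len)
    hits = 0
    for _ in range(n_sims):
        x = (rng.random(n_months) < p_hit).astype(float)
        # A run of run_len hits exists iff some window of that width
        # sums to run_len.
        if np.convolve(x, window, mode="valid").max() >= run_len:
            hits += 1
    return hits / n_sims

p_no_shift = prob_streak(1 / 3)  # independent months, tercile odds
p_shifted = prob_streak(0.6)     # hypothetical warm-shifted months
print(p_no_shift, p_shifted)
```

Under independence the 16-month streak is practically impossible; tilt the per-month odds upward and it becomes merely uncommon. Which tilt, if any, matches the world, the simulation itself cannot say.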
Our resulting calculations imply that in the absence of a trend, the probability of a 16-month or greater streak is likely in the range of 0.02 to 0.04, which is small, but not so small as to make the event completely implausible. When a linear trend of the magnitude observed in the historical record is included, this probability increases to the range of 0.06 to 0.15. Even larger probabilities were obtained when nonlinear trend models were considered…
Overall, the paper shows that in the absence of trend, the probability of a streak of 16 consecutive top-tercile events is low enough that one could legitimately query its plausibility. When either a linear or non-linear trend is included, the probability increases to the point where such a result is not out of the ordinary.
A physical trend would be a systematic, uniform change in the causes which drive T. A statistical-model trend is just a parameter in a model that indicates change through time. The parameter is not a cause and could only be said to represent the real cause accidentally. Also, some of their M have a parametric trend and some don’t.
What the future holds
Which of these models is the best in judging the probability of the 16-month streak? None. The observations are enough to tell us all we need to know about what already happened. But which model is best for future data? I have no idea. Nobody does. And we won’t know until each model makes predictions of future data, which we can then check against what actually happens.
That won’t happen. Statisticians usually aren’t interested in actual, real-life model performance, preferring instead to make pseudo-statements about past data, such as implying model probabilities are for already-observed data. Most statistics is hit and run.
The elephant in the paper is global warming—please don’t say “climate change”—and the hints are that the 16-month streak was caused by it. Maybe so. But we learn zero about this theory from the non-causal statistical models used. Do the non-causal “trend” models really resemble the actual changes in physical causes (which are supposed to be linear) due to global warming? Who knows? The models themselves can’t tell you, nor can pseudo-probabilities about already-observed data derived from them.
Indeed, the authors, all smart, well-employed statisticians, are too clever to outright claim that it is global warming which caused the observed streak. But they’re just political enough to leave the implication that it did hanging in the air.