Longtime reader Nate Winchester found a discussion—among the many, many—of global warming data revolving around statistics, from which we take the following snippet:
You just do the statistics on the data. If you calculate trends over a short period you don’t get statistically significant trends, and over a longer period you do get statistically significant trends. This is true for almost any real life data, and how long it takes for trend to show up over short term non-trended variation will depend on the data.
In the case of global temperature anomalies, it turns out that the trends in temperature become statistically significant over scales of roughly 15 to 20 years or more, and lack significant trend over shorter scales. That’s just a description of what global anomalies are doing.
Where this quotation originated is not important; probably you can find one nearly identically worded at any site in which the subject of climate change arises. But it is a useful comment, because it betrays a standard misinterpretation of statistics which we can here put right.
Suppose in front of you is a picture of a number of dots, one per year arranged sequentially, each dot representing, say, a temperature. Obviously—yes, truly, obviously—those temperatures, assuming they were measured without error, came from somewhere. That is, something, some physical process or processes, caused the temperatures to take the values they did.
They did not appear “randomly”, if by use of that word you mean some vague and mysterious metaphysical engine (run by quantum gremlins?) which spit the temperatures out for humanity to discover. But if by that word you merely mean that you do not know or do not understand what physical process caused the temperatures, then you speak intelligently.
Our second supposition requires us to weakly anthropomorphize either all, or individual portions, of the dots. You have to squint at the collection and say to yourself, “Say, if I draw a straight line running amidst the dots between year A and year B, most of those dots will lie close to the line, though only very few will touch the line.” You are allowed to draw various lines through the dots, some pointing upwards, some downwards, as long as all the lines connect head to foot, starting at the first year and ending at the last.
Once done, you can reach into your bag of statistical tricks and then ask whether the lines you have drawn are “statistically significant.” The first step in this journey to amazement requires you return to the word “random” and invoke it to describe the behavior of the dots not lying on the line. You have to say to yourself, “I know that nature chose to make the temperatures lie on this line. But since they do not lie on the line, only close to it, something else must have made the dots deviate from the line. What this cause is can only be the normal distribution.”
In other words, you have to say you already know that nature operates in straight lines, but that something ineffable steers your data away from purity. The ineffability is supplied by this odd who-knows-what called the normal distribution, the exact nature and motivations of which are never clear.
Another thing that isn’t quite clear is the slope of the line you drew. It is a line, though; in that you are certain sure. But perhaps the line does not point nearly so high; rather, it might lie flat. Must be a line, though. Has to be. After all, what else could it be?
Now, with all these suppositions, surmises, and say-whats in hand, you feed the dots into your favorite statistical software. It will churn the dots and compute a statistic, and then tell you (the whole point of the article has now come upon us, so pay attention) the probability of seeing a statistic larger than the one you actually got, given that your line theory and your ideas about randomness are faultless and that the line truly lies flat (I ignore mentioning infinite repetitions of data collection).
If this probability is small, then you are allowed to say your line is “statistically significant.” Further, you are allowed to inform the media of this fact, a tidbit for which they will be grateful.
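For the curious, here is a rough sketch of what the software does with the dots. Everything in it is an assumption made for illustration: the function name slope_p_value is invented, the "temperatures" are made-up numbers (a gentle rise plus a wiggle), and the probability comes from a normal approximation rather than the Student t distribution that real packages typically use. It also shows the behavior the quoted commenter describes, with a short window failing the test and a longer one passing it.

```python
import math
from statistics import NormalDist


def slope_p_value(years, temps):
    """Fit a least-squares line; return (slope, statistic, p_value).

    ASSUMPTION: the p-value uses a normal approximation in place of the
    Student t distribution with n - 2 degrees of freedom.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(temps) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, temps))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares: how far the dots sit from the drawn line.
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, temps))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    statistic = slope / std_err
    # Chance of a statistic at least this extreme if the true line were flat
    # and the deviations were "normal".
    p_value = 2 * NormalDist().cdf(-abs(statistic))
    return slope, statistic, p_value


# Invented data, not real measurements: 20 yearly dots.
years = list(range(2000, 2020))
temps = [0.02 * i + 0.1 * math.sin(i) for i in range(20)]

slope_short, stat_short, p_short = slope_p_value(years[:5], temps[:5])
slope_long, stat_long, p_long = slope_p_value(years, temps)

print("first 5 years: slope", slope_short, "p", p_short)
print("all 20 years:  slope", slope_long, "p", p_long)
```

Notice that the wiggle in these invented dots is a plain sine wave, which is to say it was caused by something perfectly definite. The software neither knows nor cares; it treats whatever is left over from the line as "random".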
Of course, saying your imaginary line(s) are “statistically significant” says nothing—not one thing—about whether your line(s) are concrete, whether, that is, they describe nature as she truly is, or whether they are merely figments of your fervid imagination.
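To make the point concrete, here is a small sketch using dots that, by construction, do not follow a line at all: they sit flat at one value for ten years and then flat at another for ten more. The data, the function name line_fit, and the normal approximation used for the probability are all assumptions made up for this illustration. The fitted line still earns the label "statistically significant" by the usual 0.05 cutoff.

```python
import math
from statistics import NormalDist


def line_fit(xs, ys):
    """Least-squares line; return (slope, intercept, p_value).

    ASSUMPTION: normal approximation to the usual t-based p-value.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    p_value = 2 * NormalDist().cdf(-abs(slope / std_err))
    return slope, intercept, p_value


# Invented dots: a step, not a line. Flat at 0, then flat at 1.
years = list(range(20))
temps = [0.0] * 10 + [1.0] * 10

slope, intercept, p_value = line_fit(years, temps)

# How badly the "significant" line misses the dots it supposedly describes.
largest_miss = max(abs(y - (intercept + slope * x)) for x, y in zip(years, temps))

print("slope:", slope)
print("p-value:", p_value)
print("largest miss:", largest_miss)
```

The small probability here reports only that a flat line with normal deviations fits these dots poorly. It is silent on whether a sloped line, a step, or anything else is what actually happened.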
The best part of this exercise is that you can ignore the dots (reality) entirely.