Anthony Watts over at Watts Up With That? (incidentally, a blog title infinitely superior to “William M. Briggs, Statistician”) asked me to comment on Joe D’Aleo and Don Easterbrook’s new paper, “Multidecadal tendencies in ENSO and global temperatures related to multidecadal oscillations.” The title of this post is based upon Watts’s.
Before getting to it, let me head off a criticism sure to be leveled at D&E, one which is a logical fallacy. It does not matter that their paper has been released unto the aether and that it has not gone through the tempering process of peer review. That is, their results are not false because a hostile editor did not have a chance to reject them.
Anybody who thus shouts, “No peer review!” to reject D&E has fallen prey to the “I want it to be false, therefore it is false” fallacy. When we hear this argument, it tells us more about the person making it than it does about the thesis under consideration. A truth is true wherever it is spoken. I’m surprised at how often this has to be pointed out.
AMO & PDO
First some statistical truths. The Atlantic Multidecadal Oscillation (AMO) is a function of sea surface temperature. The brother Pacific Multidecadal Oscillation (PDO) is also a function of sea surface temperature. They are called “multidecadal oscillations” because these indexes have been found empirically to bounce up then down on something like a decadal time scale.
The bouncing is not perfectly regular nor entirely predictable, the Earth being what she is. Since the AMO and PDO are functions of sea surface temperature, the mechanisms that cause temperature to change will cause these indexes to change. Examples of mechanisms: solar output, continental shift, land-use changes, atmospheric gas changes, and so forth. Note carefully that none of these are themselves functions of temperature. What about ocean currents? Well, those don’t sound like temperature, but they’re closely related because currents change partly in relation to how air temperature changes. That is, changes in air temperature are naturally correlated with changes in ocean circulation.
Obviously, sea surface temperature, thus AMO and PDO, must be highly correlated with air temperatures. When AMO or PDO goes up or down, we should expect that, on average, air temperature will go up or down. There is no mystery why this is so. Thus, direct correlations of the AMO and PDO and air temperature are of little use.
Since AMO is a function of sea surface temperature and PDO is a function of sea surface temperature, if either of these functions of sea surface temperature did not correlate with air temperature, we should be very worried. But about our analysis, not the climate. Of course, adding two functions of sea surface temperature together—AMO + PDO—produces yet one more function of sea surface temperature.
These indexes can be of some predictive use for air temperature, but only if the mechanisms that cause changes in sea surface temperature precede changes in air temperature. Loosely, in how sea surface temperatures now will drive air temperatures later. The usefulness is limited because many of the mechanisms that drive sea surface temperatures also drive changes in air temperatures. See below for what this means.
Now, the Atlantic is one side of the world, the Pacific the other. The AMO thus is a proxy for sea surface temperatures on one side, the PDO the other. Added together, we should have a rough idea of the sea surface temperature over the whole world; at least, the addition would almost certainly be a better measure than either alone.
Smoothed Time Series
That picture is Figure 18 from D&E and shows the average annual air temperature around the States and the AMO + PDO. These two curves are highly correlated. They had better be, for two reasons. The first is the one we have been talking about. The second is because D&E have made a fundamental mistake in their analysis: they have smoothed the data before plotting.
This should never be done. Anthony quotes from me in my article Do Not Smooth Time Series, You Hockey Puck!. You can read the whole thing, but the gist is that smoothing always increases correlation. Here is our recipe for generating spurious results:
1. Start with two absolutely unrelated time series which show no correlation.
2. Smooth one or both series.
3. Recompute the correlation.
4. If the correlation is not yet “statistically significant”, repeat 2 and 3 until it is.
This recipe is guaranteed. Correlation may be computed via linear regression, as D&E did, or by any other parametric statistical model. See their Figures 19 and 20 for confirmation.
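The recipe is easy to try at home. What follows is only a minimal sketch in Python (standard library only), assuming Gaussian white noise for the two unrelated series and a plain running mean as the smoother. The function names, series length, window widths, number of trials and seed are all illustrative assumptions, not anything taken from D&E. It averages the absolute correlation over many made-up pairs, first raw and then after smoothing both series.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def moving_average(x, window):
    """Plain running mean; the output has len(x) - window + 1 points."""
    return [sum(x[i:i + window]) / window for i in range(len(x) - window + 1)]

def mean_abs_corr(window, n=200, trials=300, seed=1):
    """Average |correlation| between pairs of unrelated noise series,
    after smoothing both with a running mean (window=1 means no smoothing)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        if window > 1:
            a, b = moving_average(a, window), moving_average(b, window)
        total += abs(pearson(a, b))
    return total / trials

raw = mean_abs_corr(window=1)        # no smoothing
smooth5 = mean_abs_corr(window=5)    # light smoothing
smooth22 = mean_abs_corr(window=22)  # heavy smoothing
print(raw, smooth5, smooth22)
```

Under these assumptions the typical size of the correlation should grow with the window, even though every pair of series was made up independently. Any single pair can go either way; it is the average behaviour that the sketch looks at.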
I want to stress that if D&E did not smooth their data, the correlation would not have been as high; but as high as it would have been, it would still have been expected. All that smoothing has done here is artificially inflate the confidence D&E have in their results. It does not change the fact that AMO + PDO is well correlated with air temperature.
The ocean is sluggish in response to external forcing, the atmosphere responds as quickly as a cheerleader returning a text. These two facts are why sea surface temperature indexes can be useful in predicting future air temperature. If some mechanism causes the sea surface temperature to increase in, say, the Pacific, then ceteris paribus the air above the sea will eventually requite and warm itself.
If it were not for that ceteris paribus, knowing the sea surface temperature now would tell us exactly what the air temperature would be later. But all things are rarely equal, the Earth is not constant, the mechanism that drove ocean temperature change also drives air temperature changes, the air will also change the sea temperature, etc. At best, knowing sea surface temperature now only gives us minor evidence of what air temperature will be later.
Empirical studies of the usefulness of oceanic indexes statistically predicting air temperatures agree. These models are only useful in skillfully forecasting temperature for a few months ahead of time.
“The brother Pacific Multidecadal Oscillation (AMO)” should be “The brother Pacific Multidecadal Oscillation (PMO)” ?
A quibble: You state “The brother Pacific Multidecadal Oscillation (AMO) is also a function of sea surface temperature.” Pretty sure the acronym in brackets should be PDO.
Typo. Fixed. Thanks.
One more typo, Briggs-sensei: It’s ceteris paribus.
It does seem silly for WUWT to highlight this particular chart – given its statistical weaknesses. I thought the chart on page 22 of the paper was far more intriguing since it clearly suggests the continuation of a very regular warming/cooling/warming/cooling pattern.
It seems to me that D’Aleo and Easterbrook are demonstrating that given that PDO and AMO are as far as we can tell cyclical and are directly reflective of the earth’s energy balance, we can expect a period of cooling after a period of warming. At the same time we do not have a clear idea of why PDO and AMO are cyclical with a given periodicity (most likely the Sun), but there seems little relationship between the cycles and CO2.
“Pray” should be “prey” in line 9
Bernie, Ari, gcb,
It could also be, and it is at least not unlikely, that PDO is chaotic, or chaotic within bounds, and therefore of little predictive value. Let’s not forget: whatever is driving it, is also (mostly) driving air temperature.
Yep, chaotic within bounds does not make for great predictions. Also since there is no testable or even cogent theory as to the drivers of PDO and AMO then it is hard to justify any predictions. It was nice, however, to see the relatively large divergence from the IPCC projections. Since we cannot explain variations in the energy balance as reflected in PDO and AMO, predictions of future global temperature trends need to be more circumspect.
One of the problems with smoothing that is more generally acknowledged in the signal processing world is the problem of phase shift. One may assume that because one is using a symmetric impulse response filter to convolve the underlying signal, there will be no phase shifts. While this is strictly true, the time domain behaviour (i.e., whether PDO+AMO has a consistent lead/lag with temperature) may be obscured because the relatively high frequency components whose phase determines lead and lag have been filtered out. This stems from your 10.20 remark about attribution, since although there appears to be a correlation, and a reasonable hypothesis relating PDO & AMO to temperature, this does not eliminate a common driver of both effects.
I am always rather surprised to see straightforward linear correlation of signals that may contain phase shifts (i.e.: delays). This analysis is probably better performed in the frequency domain so that the delay between the two signals can be established, although I grant that the length of signal in relation to the presumed fundamental frequency is rather short and would involve some rather arbitrary tapering of the signal.
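The comment suggests a frequency domain analysis. A simpler time domain stand-in, named plainly as a substitute, is to compute the correlation at a range of lags and see where it peaks. The following is a minimal sketch in Python (standard library only) on synthetic data: `x` is white noise and `y` is a copy of `x` delayed by three steps plus independent noise. The delay, noise level, lag range and seed are illustrative assumptions, and nothing here uses the real PDO, AMO or temperature records.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def lagged_corr(x, y, lag):
    """Correlation of x[t] with y[t + lag]; a positive lag means y trails x."""
    if lag >= 0:
        xs, ys = x[:len(x) - lag], y[lag:]
    else:
        xs, ys = x[-lag:], y[:len(y) + lag]
    return pearson(xs, ys)

rng = random.Random(7)
n, true_lag = 300, 3
x = [rng.gauss(0, 1) for _ in range(n)]
# y repeats x three steps later, plus independent noise (made-up data)
y = [(x[t - true_lag] if t >= true_lag else 0.0) + 0.5 * rng.gauss(0, 1)
     for t in range(n)]

# Scan lags from -10 to +10 and pick the one with the largest correlation
corrs = {lag: lagged_corr(x, y, lag) for lag in range(-10, 11)}
best_lag = max(corrs, key=corrs.get)
print(best_lag, round(corrs[best_lag], 2))
```

On unsmoothed data like this the peak sits at the built-in delay while the zero-lag correlation stays small. Heavy smoothing of both series before the scan would broaden that peak, which is one way to see the concern about obscured lead and lag.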
Are the sea temperatures useful predictors? I haven’t read the article, but by eyeball it looks like the sea temperatures lag the Mean US Temp. As you mention, you would want that the other way around for prediction.
It can be difficult to distinguish bad forecasts by eye, because it is hard to train yourself to look at vertical differences between series that track closely in value but with a lag. This is an informal version of what RC says above – I learned this lesson with my own forecasts.
Since I was the one who quoted you first in the comments, Matt, and pointed Anthony at your analysis, the hat tip properly belongs to me.
Quoting myself, written 15 seconds after seeing the first slide. Matt’s admonition against smoothing should be branded into people’s brains. Nuff said.
Steven Mosher says:
September 30, 2010 at 10:18 pm
“Figure 18: With 22 point smoothing, the correlation of US temperatures and the ocean multidecadal oscillations is clear with an r-squared of 0.85”
1. the underlying indices are based on SST. temperature
2. we already know the SST is fairly well correlated with the temps you see over land.
3. Ws Briggs https://www.wmbriggs.com/blog/?p=735
“Somebody at Steve McIntyre’s Climate Audit kindly linked to an old article of mine entitled “Do not smooth series, you hockey puck!”, a warning that Don Rickles might give.
Smoothing creates artificially high correlations between any two smoothed series. Take two randomly generated sets of numbers, pretend they are time series, and then calculate the correlation between the two. Should be close to 0 because, obviously, there is no relation between the two sets. After all, we made them up.
But start smoothing those series and then calculate the correlation between the two smoothed series. You will always find that the correlation between the two smoothed series is larger than between the non-smoothed series. Further, the more smoothing, the higher the correlation. The same warning applies to series that will be used for forecast verification, …”
One graph did say it all. You do not prove anything by taking one time series that is built from SST temperatures and a second time series of temperatures, smoothing them and then regressing. Except that a smoothed time series of temperatures correlates with another smoothed series of temperatures.
Given the definition of the PDO and the AMO I would have thought that the smoothing was the least of the issues. Both these indicators use “de-trended” temperatures i.e. the warming trends have been removed so that you can better see the oscillations. The authors then use these indicators, from which warming has been removed, to show that there is no overall warming! Does this not strike anyone as odd? I am also suspicious that the authors say they “normalised” the curves, which I suspect means that they adjusted the heights to make it look as if they fit better.
Another example of this can be seen on the website
where the author plots the AMO, and other oscillations, against CO2 levels and claims it is a surprise that there is no correlation. Well of course there is no correlation – the warming trend from the AMO etc has already been removed.
Tsonis et al GEOPHYSICAL RESEARCH LETTERS, VOL. 34, L13705, doi:10.1029/2007GL030288, 2007 have a far more sophisticated analysis relating ocean cycle dynamics to climate shifts during the 20th century. They treat several ocean cycles as coupled anharmonic oscillators and track the degree of synchronicity between them. They find that changes in synchronicity correspond to observed changes in the earth’s global temperature trend.
They do not smooth.
There seems little doubt that these cycles have a strong influence on climate. They are not captured in climate models.
Ummm…aren’t the AMO/PDO indexes they’re using in that actually related more to pressure/wind values than they are to temperature? Temperature is just a side effect of mixing and stagnation.
I could be wrong though.
There is great doubt; or, rather, the sequence of causation is often confused. It is true that shifting ocean currents cause air temperature to change. And something is causing those ocean currents to shift. But many or most of the mechanisms that cause ocean current shifts also directly cause air temperatures to change. Linearity just isn’t in it.
Just want to point out that it is next to impossible to obtain sensor data that haven’t been subjected in some way to low pass filtering (smoothing). A temperature reading, for example, represents the average thermal energy of whatever is being measured. Your admonition not to use smoothed data in further analyses implies that NO sensor data should ever be used for analysis.
ALL sensor data are subject to noise which MUST be filtered if they are to be useful. Have you ever seen thermocouple readings taken off a high impedance bridge? It is only when the ‘noise’ is inherent (whatever that means) that further filtering should not be applied. In any case, one needs to start somewhere. How would you (personally) go about determining the source of the noise for instance?
So is the filtering before graphing really a “fundamental error”? In my view, it’s a judgment call.
My personal view of the physical world is CHAOS = DON’T-KNOW-HOW-TO-COMPUTE-EXACTLY and chaos is not an inherent property anymore than probability is.
In a certain sense it is a judgment call – but a judgment call that reflects your understanding of the system or phenomena being addressed. Surely the difference in the sensor data is that you know more about the likely sources of errors and their relationship to what you are measuring. Therefore, the filter is not removing relevant information it is removing irrelevant information. With climate data, it appears that we know so little that smoothing by definition is removing information which may be relevant or irrelevant – we do not know.
That’s tantamount to saying “we know very little so why bother trying?” I agree that one should be careful about conclusions based upon the analysis, though, as it doesn’t make a strongly convincing argument. Briggs’s admonition is really in the same class as those about causal inferences based upon correlation. It’s too much of a good thing and not always true: a gray area and hardly black or white.
I also disagree with the apparent supposition of the existence of inherently chaotic systems. My personal feeling is that chaos is synonymous with randomness but that’s another topic.
1. Smoothing can be used to decipher the trend in a data series. However, are there practical/ interpretational reasons in climate science to regress one smoothed series on another smoothed series? I imagine (only because I have never done so when analyzing time series data or panel data) that the equation, appropriately postulated, representing the relation between the original two series could be different from the one between the two smoothed series. Need time to think about this.
Sometimes, practical interpretations win over mathematical reasons in statistical modelling.
2. What is the practical meaning of AMO + PDO?? Perhaps there is one. IDK since I am no expert in climate science. Technically, corr(AMO + PDO, TEMP) may not be larger than corr(AMO, TEMP) or corr(PDO, TEMP). That is, the addition may not be a better measure of TEMP based on the correlation criterion. I must have missed the correlations reported in the paper.
3. Why running a regression of TEMP on AMO+PDO (one predictor)??? This places the constraint that both AMO and PDO have the same coefficient. A regression of TEMP on AMO and PDO (two predictors, that is) would probably be better, at least in terms of R^2.
4. Do people in climate science bother to check diagnostic measures and the acf plot (correlogram) of residuals????
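Points 2 and 3 can be checked on invented numbers. The following is a minimal sketch in Python (standard library only) in which `amo`, `pdo` and `temp` are synthetic stand-ins, not the real indexes: `temp` is built with deliberately unequal weights (1.0 and 0.2, both assumptions chosen for illustration) plus noise. It compares R^2 from regressing on the sum (one predictor, equal coefficients forced) with R^2 from regressing on the two series separately.

```python
import random

rng = random.Random(42)
n = 500
# Synthetic stand-ins only; the unequal weights are an illustrative assumption
amo = [rng.gauss(0, 1) for _ in range(n)]
pdo = [rng.gauss(0, 1) for _ in range(n)]
temp = [1.0 * a + 0.2 * p + 0.5 * rng.gauss(0, 1) for a, p in zip(amo, pdo)]

def centred(v):
    """Subtract the mean so the intercept drops out of the regression."""
    m = sum(v) / len(v)
    return [e - m for e in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def r2_one(x, y):
    """R^2 of a one-predictor regression on centred data (squared correlation)."""
    return dot(x, y) ** 2 / (dot(x, x) * dot(y, y))

a, p, y = centred(amo), centred(pdo), centred(temp)
s = [ai + pi for ai, pi in zip(a, p)]  # AMO + PDO: one predictor, equal weights

r2_sum = r2_one(s, y)   # TEMP on AMO + PDO
r2_amo = r2_one(a, y)   # TEMP on AMO alone

# Two predictors: solve the 2x2 normal equations by Cramer's rule
saa, spp, sap = dot(a, a), dot(p, p), dot(a, p)
say, spy = dot(a, y), dot(p, y)
det = saa * spp - sap ** 2
b_amo = (say * spp - spy * sap) / det
b_pdo = (spy * saa - say * sap) / det
r2_two = (b_amo * say + b_pdo * spy) / dot(y, y)

print(round(r2_sum, 3), round(r2_amo, 3), round(r2_two, 3))
```

Because the one-predictor model is a constrained version of the two-predictor model, its R^2 can never be the larger of the two. With weights as unequal as the ones assumed here, even the single series with the larger weight should beat the sum. Whether the real AMO and PDO behave this way is a separate, empirical question.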
… are there practical/ interpretational reasons in climate science to regress one smoothed series on another smoothed series?
There most certainly are! What is smoothing if it isn’t a loss of precision? Almost every dataset, be it census, production or whatever, incorporates precision loss. It’s inevitable. It happens as soon as one selects a precision level. Theoretically, you could count every person in the U.S. but the reality is that some bits will be lost. It helps if you never forget that.
So, when you regress say pressure against temperature, you are literally regressing two smoothed series no matter how careful you are.
The real question is: how much smoothing is too much? That’s hard to answer.
I am guessing that a linear regression is the mother of all smoothings, except for maybe averaging both axes to a single point, but if you wanted to find the signal from global warming, could you do a LR on the PDO+AMO and the temp, then start them at the same point, then take the ending difference as the global warming signal? I ask out of ignorance, so please be kind.
And I mean do LRs on the unsmoothed data, not smoothed.
Correct, smoothing yields “new data” with less variability. Yes, the monthly average temperature and monthly average pressure are smoothed data.
Let’s note one difference between “monthly average” and “moving average.” That is, the monthly averages can be said to have been observed at (equally spaced) distinct time points, which is an important assumption in statistical regression modeling. However, you won’t be able to claim that moving averages are observed at distinct time points.
If the validity of regressing one smoothed data series on another can be somehow statistically/mathematically justified, perhaps, then the question would be how much smoothing should be performed.
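One way to read the monthly-versus-moving distinction is through overlap: consecutive moving averages share most of their inputs, while consecutive block averages share none. The following is a minimal sketch in Python (standard library only), assuming white noise as the raw data and a window of 12. The window, series length, seed and function name are illustrative assumptions. It compares the lag-1 autocorrelation of the two kinds of average.

```python
import random

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[t] - m) * (x[t + 1] - m) for t in range(n - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den

rng = random.Random(3)
w = 12  # window length, chosen to suggest 12 "days" per "month" (an assumption)
noise = [rng.gauss(0, 1) for _ in range(6000)]

# Non-overlapping block means: each raw value is used exactly once
block_means = [sum(noise[i:i + w]) / w for i in range(0, len(noise), w)]
# Moving means: neighbouring values share w - 1 of their w inputs
moving_means = [sum(noise[i:i + w]) / w for i in range(len(noise) - w + 1)]

acf_block = lag1_autocorr(block_means)
acf_moving = lag1_autocorr(moving_means)
print(round(acf_block, 3), round(acf_moving, 3))
```

With independent raw data the block means should stay close to uncorrelated, while the moving means should be strongly serially correlated purely by construction. That built-in dependence is what a regression on moving averages would have to account for.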
DAV: "I also disagree with the apparent supposition of the existence of inherently chaotic systems. My personal feeling is that chaos is synonymous with randomness but that’s another topic."
It is perfectly possible for something to be chaotic in appearance and also deterministic at the same time, but perhaps I’m only saying the obvious. There is also the problem of not having enough math or initial condition precision to calculate the deterministic result and get the correct values. True random processes, however, are not deterministic.
I might also add that there is not only time based smoothing going on with the AMO+PDO and temperature values, but also spatial smoothing. What is happening in the west Pacific is not what happens in the east Pacific. We also have a problem with temperature in that storm systems move across space over time, so a warm mass of air will cause cycles in one area as it moves over it. Similar things happen with ocean currents moving warm and cold water over an area. I believe this is why you see so much smoothing going on, both in time and space, and there are also many problems with the way that it is done leading to biases in the data.
While it makes sense to not smooth the data in time, can the same be said for area?
RE: The ocean is sluggish in response to external forcing, the atmosphere responds as quickly as a cheerleader returning a text.
WHAT the #@@$%^*&@! does THAT mean?!?!?!?!
A typical “cheerleader” will return a text quickly or not or never based on the source. Which implies that this remark is intended as a sort of ‘random-response-generator.’
ON a more serious note, Jasper Kirkby’s lecture at CERN a while back presents some intriguing correlations that imply a solar cause to Earth’s warming/cooling effect (if you observe you’ll note he just stops short of declaring causality):
James Gibbons said:
‘It is perfectly possible for something to be chaotic in appearance and also deterministic at the same time … True random processes, however, are not deterministic.’
However, the opposite can also be true i.e. something can look deterministic, but be chaotic. For example, a chaotic input to a pair of coupled oscillators can produce a resonant oscillation. Might the 10 year cycle simply be the result of phase synchronization in a coupled chaotic system?
“I also disagree with the apparent supposition of the existence of inherently chaotic systems. My personal feeling is that chaos is synonymous with randomness but that’s another topic.”
All these are incorrect words.
The right classification is:
– deterministic systems (example a simple pendulum). They are governed by linear equations.
– deterministic chaos systems (example planetary orbits or a die). Governed by non linear equations.
Both systems are deterministic but the former is not sensitive to initial conditions while the latter is.
The most important property is that deterministic systems can be predicted for any time while deterministically chaotic systems cannot.
In the best case the deterministic chaos will leave some probability distribution invariant (a die) and a statistical interpretation will be possible, and in the worst case there will be no invariant PDF (planetary orbit) and no statistical interpretation will be possible.
– random systems (example quantum mechanics). This one is very tricky. Actually the Schrodinger equation is strictly deterministic. But we can’t observe the wave function. What we observe are eigenvalues of an operator and the one that we’ll observe will be random.
So the evolution of the operator will be deterministic but the eigenvalue that we’ll observe will be random.
Strictly speaking random dynamics don’t exist. The evolution can only be deterministic or deterministic chaos.
The only criminal mistake that one may make is to equate chaos with randomness ;)
Chaos is always a property of solutions of some system of perfectly deterministic equations so it can be anything but random.
There may be some emergent statistical properties of solutions in some cases but the solutions themselves are never “random”.
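Sensitivity to initial conditions in a perfectly deterministic rule is easy to see numerically. The following is a minimal sketch in Python using the logistic map x -> 4x(1 - x), which is a standard textbook example swapped in here for illustration and not one of the systems named above. The starting value, the size of the perturbation and the number of steps are all assumptions.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate the deterministic map x -> r * x * (1 - x) from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two starting points that differ by one part in ten billion
a = logistic_orbit(0.2, 100)
b = logistic_orbit(0.2 + 1e-10, 100)

gap = [abs(u - v) for u, v in zip(a, b)]
early_gap = gap[5]        # still tiny after a few steps
late_gap = max(gap[50:])  # order one once the orbits have decorrelated
print(early_gap, late_gap)
```

No randomness enters anywhere: the same starting value always gives the same orbit. Yet the two orbits part company after a few dozen steps, which is why long range prediction fails in practice unless the initial condition is known exactly.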
TomVonk: “deterministic systems (example a simple pendulum). They are governed by linear equations.”
Pendulums are not linear! The force varies in a non-linear manner with the angle. A mass on a spring may be linear if the spring constant is linear.
However, I do mostly agree with the rest of what you said. And it seems that Einstein was wrong when he said “God doesn’t play dice with the universe.” I really think that we are missing something here because under a truly deterministic universe, time would be reversible, but it doesn’t appear to be. It is true that the scientific method (and statistics) assume determinism when the truth may be more subtle.
Pendulums are not linear! The force varies in a non-linear manner with the angle. A mass on a spring may be linear if the spring constant is linear.
You must pay attention to what is written and not what you think is written.
I didn’t say that the pendulum was linear. I didn’t say anything about forces.
The point was to give a trivial example of a non chaotic system, which a pendulum definitely is. On top of that, deterministic non chaotic systems are often described by linear differential equations. That is btw also the case for the pendulum for small initial amplitudes. This is not the case for great amplitudes but it is irrelevant.
To be precise, chaos => non linear equations is true but non linear equations => chaos is wrong.
That’s why chaos is not equivalent to non linearity.
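For reference, the two statements about the pendulum can be written side by side. This is the standard idealized simple pendulum (point mass, rigid massless rod, no friction); the symbols g for gravitational acceleration, \ell for the rod length and \theta_0 for the release angle are notation introduced here, and the solution shown assumes release from rest.

```latex
% Full equation (non linear in \theta), and its small-amplitude linearization
\ddot{\theta} + \frac{g}{\ell}\,\sin\theta = 0
\quad\xrightarrow{\;|\theta| \ll 1,\ \sin\theta \approx \theta\;}\quad
\ddot{\theta} + \frac{g}{\ell}\,\theta = 0,
\qquad
\theta(t) = \theta_0 \cos\!\left(\sqrt{g/\ell}\;t\right)
```

The full equation is non linear, as the objection says, and the small-amplitude version is linear, as the reply says. Neither form is chaotic for the unforced pendulum, which is consistent with the point that non linearity alone does not imply chaos.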
It is true that the scientific method (and statistics) assume determinism when the truth may be more subtle.
Determinism just means that the dynamics obey some system of ODE or PDE, which definitely all systems we know do.
That is called laws of nature.
The purpose of science is to find them, so not looking for the laws of nature (e.g. not looking for determinism) would be something other than science.
The non scientific “truth” is certainly ill defined and probably not subtle at all.
I’m not suggesting touchy-feely science rules. But then things like the Pioneer Anomaly come along and make you wonder if everything is really peachy keen with the laws of nature.
It is clearly a mystery that the universe was built by a mathematician. (and that math explains it so well)
“But then things like the Pioneer Anomaly come along and make you wonder if everything is really peachy keen with the laws of nature.”
I don’t know where this unfortunate tendency comes from to see mysteries and doubts about laws of nature proven by evidence of millions of experiments with accuracies vastly better than 10^-x, x being a rather large number.
I don’t know if the so called Pioneer Anomaly made you wonder but it certainly didn’t interest many real physicists.
See: http://arxiv.org/abs/0710.2656 this is already 3 years old and doesn’t explain all of the anomaly but it’s just to give an idea of how physicists treat things.
See also : http://motls.blogspot.com/2007/04/probabilities-of-various-theories.htm
Even if this kind of exercise needs an extremely deep knowledge of fundamental physics and a not quite objectively formalized approach, note especially that the probability of the following statement being true, which implies consistency with all other firmly demonstrated statements, is 2%:
2% – The acceleration of the Universe or the Pioneer anomaly or similar observable effects may be explained by a dramatic modification of general relativity at very long distances comparable to the Hubble scale, or very small accelerations at the same scale
This also explains why almost all physicists prefer to spend their time on statements and problems that have a probability of 98% to be relevant rather than 2%, and that’s why almost no physicist is interested in the Pioneer “anomaly”.
You know, physics is not a domain where “anything goes”. In the AGW domain it does, but that’s why AGW is not physics.
Of course it is not the same thing for popular and mass media. On the contrary, they look for very low probability but high wonder statements.
Saying that general relativity is a consistent theory of gravity and has been proven to be right with at least 99,9999…% probability is boring and doesn’t sell papers.
Interviewing a guy who is saying that he has just found an effect proving that “everything we thought up to today about gravity was wrong” will set the mindless journalistic world in a frenzy even if they report about a supposed problem that is extremely unlikely (less than 2%) to be true.
However if you belong to the people who like to spend their time with very low probability interpretations/problems instead of high probability interpretations, then you can follow the Pioneer story here: http://www.planetary.org/programs/projects/innovative_technologies/pioneer_anomaly/update_20090209.html