Reminder: The Thursday Class is only for those interested in studying uncertainty. I don't expect everyone will want to read these posts, so please don't feel like you must. Yet I have nowhere else to put them besides here. Your support makes this Class possible for those who need it. Thank you.
The search for formulae that transform time series into perfect predictions is fruitless, but the quest goes on.
Video
Links: YouTube * Twitter – X * Rumble * Bitchute * Class Page * Jaynes Book * Uncertainty
HOMEWORK: Find your own time series with trendy trend lines. See if they suffer from the Deadly Sin of reification.
Lecture
Review the last two Classes: That’s Not Noise, That’s Signal! and Trendy Trends In Time Series. We learned correlations aren’t causations, except rarely; no, not even if they are your correlations.
You already know nothing has a probability (if you are just entering the Class, see this). Just so, nothing has a correlation, which is another kind of name for probability. Thus, any method which promises to discover “the” or “true” probabilities, or correlations, must be flawed.
All classical procedures that use any kind of hypothesis testing, Bayes factors, or any (model) parameter-centric analysis which seeks “true” values of probabilities, or that assumes correlations are causations, are in fundamental error, and their use must cease, as we have demonstrated many times. But sometimes, as we also saw, the inherent flaws in these methods can be overcome by good experimental procedure—though it’s the methods that get the credit, and not the thinking of the scientist. Yet there is one area which is the worst of both worlds, as it were. And that is time series.
Traditional time series is full of correlational alchemies that seek to transmogrify correlations into causations. And they believe they have done it. Or will. Time series mavens, most especially in AI, believe that mysteries lie hidden in accidental (and not per se) sequences of numbers, and that if only the right philosopher’s stone can be found, predictive perfection will follow.
This is a false hope.
We discussed the closing stock price of IBM as a common, and monetarily interesting, accidental time series. Each point (a price) in the series has innumerable mostly hidden causes and conditions, in long chains which forbid any method from discovering what they are. You will never know why—never know the full explanation, that is—the price at noon last Thursday was exactly what it was. You may know some of those reasons, but only some. And what you know is a pittance next to what can and must be known.
There will be no way to tease out a series’ causes from just the numbers in the time series itself. Not for almost all time series. Exceptions, rare as tenured conservatives, are where the accidental series are physical, and where hopes of causal explanation are genuine. But notice very carefully that in these exceptions, knowledge of the causes comes from outside the time series themselves. The time series numbers are still not enough.
They will never be enough.
Yet because there is belief in probability as a real thing, there are endless formulae purporting to have discovered The Secret of Time Series. I include in this the endless technical analysis of stock prices, in which some (curiously not trillionaires) claim to have discovered secrets. There are acres of books, papers, and code on time series. Discussing even a fraction would take us weeks. We can’t afford the time. All we can do is give a flavor.
Notice above I said we might know some conditions or causes in accidental time series. Seasonality is a common such condition. Seasons themselves are not causes: seasons have causes. So even if there are, as there often are in environmental and economic data, clear sinusoidal patterns, we are still working with correlations. But of course when we see regularities of any kind, these ought to be captured by our models. And usually are.
In the same way, there is nothing in the world wrong with making use of correlations. If strong correlations exist, they can help predict future data—as long as we gird our mental loins against the Deadly Sin of Reification and avoid calling them causes (imagine a mental loin).
And there you have the two main techniques of time series: regularities and correlations. Everything else is only more or less useful, depending on the situation. For instance, various forms of regression are tried, adding in external measures (falsely called “control”) with the time series. These work in roughly the same way as ordinary regression (review). So I’ll pass them by here.
Regularities are as easy as they seem. Conceptually, anyway. That doesn’t mean they are always easy to spot. There are various techniques, like Fourier and wavelet transforms, that might help tease them out. Of course, since we are not dealing with causes, the regularity you thought was real could turn out to be a figment, and then you will have added noise to your model.
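As a minimal sketch of the hunt, here is one way to use a Fourier transform (via numpy) to peek at candidate periodicities. The data are made up for illustration: a 12-month cycle buried in “noise”.

```python
# Sketch: using a Fourier transform to hunt for regularities.
# The data here are invented: an annual (12-month) cycle plus "noise".
import numpy as np

rng = np.random.default_rng(42)
n = 240  # twenty years of monthly observations
t = np.arange(n)
y = 10 + 3 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, n)

# Periodogram: big spikes hint at candidate regularities.
power = np.abs(np.fft.rfft(y - y.mean())) ** 2
freqs = np.fft.rfftfreq(n, d=1.0)  # cycles per month

top = freqs[np.argmax(power[1:]) + 1]  # skip the zero frequency
print(f"Dominant period: {1 / top:.1f} months")  # ~12, as built in
```

The spike reveals a regularity, not a cause; on another stretch of data the spike may vanish.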
Time series models are, or anyway ought to be, judged the same way as all models are judged. By how well they predict. But we haven’t yet covered that topic, so leave it aside for now.
No accidental time series has a correlation. Correlations may be present in the data. You may or may not choose to use these, as some may be spurious. It is this point which sees the loudest disputes, because of the unfortunate belief correlations are real. “You didn’t use the correlation I favor in your model!” The answer to this always should be, “Get over it.”
Of course, strong correlations are usually useful, and your critic may help remind you of this, even though his motivation is in error. You can also ignore all of them, if it suits you. Drawing trends on time series, as we saw last week, does this.
At last, we have the fabled ARIMA model: Auto-Regressive Integrated Moving Average. The AR is correlations, the Integrated possible trends, and the Moving Average also correlations of a different kind. From this model are built GARCH models: Generalized Autoregressive Conditional Heteroskedasticity. The main difference from the ARIMA is that the variability of the series is assumed to change in time in a regular way.
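For the curious, here is a minimal sketch of fitting one, using the ARIMA class from the statsmodels Python package. The order (1, 1, 1) and the made-up series are assumptions for illustration, not recommendations.

```python
# Sketch: fitting an ARIMA(p, d, q) with statsmodels.
# y is any one-dimensional series, e.g. closing prices.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(1)
y = np.cumsum(rng.normal(0, 1, 500))  # stand-in "price" series

# p=1 AR lags, d=1 difference (the "Integrated" bit), q=1 MA terms.
model = ARIMA(y, order=(1, 1, 1))
fit = model.fit()
print(fit.summary())  # the alphas ("ar.L1") and thetas ("ma.L1") appear here
```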
Because of pure laziness, I steal math for the ARMA model from Wokepedia (all ARIMA models become ARMA models after differencing: see the video).
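In lag-operator form (L being the backshift operator, and ε_t the “noise”), the formula from that article reads:

$$\left(1 - \sum_{i=1}^{p'} \alpha_i L^i\right) X_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i\right) \varepsilon_t$$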
(There is no good reason to put the apostrophe on the p: ignore it.)
The alphas are for the main correlations of the data with itself at “lag i”, where i = 1, 2, …, p. The thetas are for the “noise” correlations, which is to say, the correlations left over after the main correlations are accounted for. These run from 1 to q.
You can see that there is nothing causal in this. It’s correlations from start to finish. GARCH models are similar, but add bits about changing variability. Regularities like seasonality and trends (the “Integrated” bit) are first removed before the ARMA or GARCH parts begin. Various flawed hypothesis testing methods tell you which p and q to pick. Yes, the dreaded P-value rears its ugly self once again. Though just as often other measures of model fit are used; things like AIC and BIC, both so-called “information criteria”. We’ll cover these in model goodness. None of these are definitive.
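To see how the picking usually goes, here is a sketch of the customary (and, again, non-definitive) AIC scan over a small grid of orders; the grid bounds are arbitrary assumptions.

```python
# Sketch: the usual (non-definitive!) AIC scan for picking p and q.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)
y = np.cumsum(rng.normal(0, 1, 300))

results = {}
for p in range(3):
    for q in range(3):
        try:
            results[(p, q)] = ARIMA(y, order=(p, 1, q)).fit().aic
        except Exception:
            pass  # some orders fail to converge; skip them

best = min(results, key=results.get)
print(f"AIC favors (p, q) = {best}; other criteria may disagree")
```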
Testing, alas, is also found in the search for “stationarity.” This is a mythical state of the data in which it conforms to your model. Departures from model assumptions reveal a lack of “stationarity”. The idea comes from supposing the parameters, things like those alphas and thetas, are real and have “true” values. Which is false.
Nevertheless, you will see things like Dickey-Fuller (yes, really) tests, unit-root tests (for an obscure mathematical reason) and the like. None of these have any validity in discovering cause, and all arise from the false assumption the series is somehow itself part of the world. What it really means is that your correlational model is inadequate.
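So you can recognize the machinery in the wild (not so you use it), here is what the augmented Dickey-Fuller test looks like in statsmodels:

```python
# Sketch: the (augmented) Dickey-Fuller unit-root test, shown only
# so you can recognize it; the P-value caveats above all apply.
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(3)
y = np.cumsum(rng.normal(0, 1, 300))  # a random walk: "non-stationary"

stat, pvalue, *_ = adfuller(y)
print(f"ADF statistic = {stat:.2f}, P-value = {pvalue:.3f}")
# A large P-value gets read as "failed to reject a unit root"; it says
# nothing about cause, only that the correlational model is inadequate.
```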
There is another hypothesis test, called a “Granger test”. Sometimes, sadly, “Granger test for causality.” This is used when you have more than one accidental time series and want to use the correlation between them to produce predictions superior to any single series. In other words, it’s like regression. But, as you will have guessed, there is no causality verified. The site Spurious Correlations has hundreds of multiple-series that pass Granger and other tests and which are obviously acausal, which is to say, spurious (when last I checked, there was a gorgeous wee P-verified correlation between Automotive Recalls and Google Searches for Cat Memes).
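Here, likewise for recognition only, is a sketch of a Granger test run on two unrelated made-up series; do not be shocked if some draws produce a wee P.

```python
# Sketch: the Granger "causality" test on two unrelated series.
# Passing it verifies correlation across lags, never causation.
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(4)
x = np.cumsum(rng.normal(0, 1, 200))  # stand-in "automotive recalls"
y = np.cumsum(rng.normal(0, 1, 200))  # stand-in "cat meme searches"

# Tests whether lags of the second column "help predict" the first.
data = np.column_stack([y, x])
grangercausalitytests(data, maxlag=4)
```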
Please don’t use hypothesis testing. I’m asking nicely.
With all that said, since people have been at it for a few decades, there exists a fairly good suite of tools to make reasonable predictions of well-behaved accidental time series. Those that are not as well behaved are not predicted as well.
Our last word of caution. The predictions ought to be—here, and in every model—couched in terms of observational uncertainty, which is to say probabilities of observables. “There is an 80% chance of rain”, “There is a 90% chance the price will be between a and b”, and the like.
What we do not want, but usually get, is parametric uncertainty, as we saw in the wrong way to do regression. This is a reduced form of the full uncertainty that speaks of unobservable, non-real model guts. That is because of the legacy of parameter-centric analysis, itself driven by the false belief that probability is real. Result? Massive over-certainty, the reproducibility crisis, etc.
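To end on the right note, here is a sketch of extracting predictions in observable form from the ARIMA fit above, as “there is a 90% chance the value will be between a and b”. The 90% level is an arbitrary choice, and the intervals are, of course, conditional on the model.

```python
# Sketch: predictions as observational uncertainty, i.e. probabilities
# of observables, not statements about unobservable parameters.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(5)
y = np.cumsum(rng.normal(0, 1, 500))  # stand-in "price" series

fit = ARIMA(y, order=(1, 1, 1)).fit()
fc = fit.get_forecast(steps=5)
lo, hi = fc.conf_int(alpha=0.10).T  # 90% predictive intervals

for step, (a, b) in enumerate(zip(lo, hi), 1):
    print(f"Step {step}: 90% chance the value is between {a:.2f} and {b:.2f}")
```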
Now for some examples. Rather, not now. This is more than enough for one lecture. Next time we’ll do examples, all oriented to the points made today. We will not discover any occult method to produce perfect predictions. Because none exists.
Here are the various ways to support this work:
- Subscribe at Substack (paid or free)
- Cash App: $WilliamMBriggs
- Zelle: use email: matt@wmbriggs.com
- Buy me a coffee
- Paypal
- Other credit card subscription or single donations
- Hire me
- Subscribe at YouTube
- PASS POSTS ON TO OTHERS