Please pass this on to ANY researcher who uses statistics. Pretty please. With sugar on top. Like I say below, it’s far far far far far past time to cease using statistics to “prove” cause. Statistical methods are killing science. Notice the CAPITALIZED words in the title to show how SERIOUS I am.
Statistical models cannot discern or judge cause, but everybody who uses statistics thinks models can. I prove—where by prove I mean prove—this inability in my new book Uncertainty: The Soul of Modeling, Probability & Statistics, but here I hope to demonstrate it to you (or intrigue you) in short form using an example from the paper “Emotional Judges and Unlucky Juveniles” by Ozkan Eren and Naci Mocan at the National Bureau of Economic Research.
Now everybody—rightfully—giggles when shown plots of spurious correlations, like those at Tyler Vigen's site. A favorite is per capita cheese consumption and the number of people who died by becoming tangled in their bedsheets. Two entangled lines on a deadly increase! Perhaps even more worrying is the positive correlation between the number of letters in the winning words at the Scripps National Spelling Bee and the number of people killed by venomous spiders. V-e-n-o-m-o-u-s: 8.
All of Vigen’s silly graphs are “statistically significant”; i.e. they evince wee p-values. Therefore, if statistical models show cause, or point to “links”, then all of his graphs must—as in must—warn of real phenomena. Read that again. And again.
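To see how easily two unrelated trends earn a wee p-value, here is a minimal sketch using made-up numbers (not Vigen's actual data, just two invented upward-trending yearly series in the shape of the cheese-and-bedsheets pair):

```python
from scipy.stats import pearsonr

# Synthetic stand-ins: any two series that both drift upward will do.
cheese = [29.8, 30.1, 30.5, 30.6, 31.3, 31.7, 32.6, 33.1, 32.7, 32.8]
deaths = [327, 456, 509, 497, 596, 573, 661, 741, 809, 717]

r, p = pearsonr(cheese, deaths)
print(f"r = {r:.2f}, p = {p:.5f}")
# The p-value is wee, the correlation strong; the test is satisfied.
# Nothing in the computation knows, or can know, that no cause is present.
```

Run it and the test reports "significance" with gusto; the arithmetic is identical to what a published paper would show.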
All of Vigen’s correlations would give large Bayes Factors. Therefore, if Bayesian statistical methods show cause, or point to “links”, then all of these graphs must—I must insist on must—prove real links or actual cause.
All of Vigen’s data would even make high-probability predictions, using the kind of predictive statistical methods I recommend, or using any “machine learning” algorithm. Therefore, if predictive or “machine learning” methods show cause, or point to “links”, then all of his graphs must—pay attention to must—prove cause or show real links.
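The predictive point can be sketched the same way: fit an ordinary least-squares line to two invented trending series (hypothetical numbers, standing in for any of the spurious pairs) and the "predictions" come out nearly perfect:

```python
# Least-squares fit, pure Python: predict one made-up trend from another.
x = [5.0, 5.5, 6.1, 6.4, 7.0, 7.3, 7.9, 8.2]        # hypothetical predictor
y = [10.2, 11.0, 12.4, 12.9, 14.1, 14.6, 15.9, 16.5]  # hypothetical target

n = len(x)
mx, my = sum(x) / n, sum(y) / n
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
intercept = my - slope * mx
pred = [slope * xi + intercept for xi in x]

ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
ss_tot = sum((yi - my) ** 2 for yi in y)
r2 = 1 - ss_res / ss_tot
print(f"R^2 = {r2:.3f}")
# R^2 is near 1: excellent "predictions", zero causal content.
```

High predictive skill on held-in data is just as cheap as a wee p-value, and just as silent about cause.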
I insist upon: must. If any kind of probability model shows cause or highlights links, then any “significant” finding must prove cause or links. Any data fed into a model which shows significance (or a large Bayes factor, or a high-probability prediction) must be identifying real causes.
Since that conclusion is true given the premises, and since the conclusion is absurd, there must be something wrong with the premises. And what is wrong is the assumption probability models can identify cause.
There is no way to know, using only sets of data and a probability model, whether any cause is present. If you disagree, then you must insist that every one of Vigen’s examples is a true cause, previously unknown to science.
Here’s an even better example. Two pieces of data, Q and W, are given, and Q modeled on W, or vice versa, gives statistical joy: these two mystery data yield wee p-values, large Bayes factors, high-probability predictions. Every statistician, even without knowing what Q and W are, must say Q causes W, or vice versa, or that Q is “linked” to W, or vice versa. Do you see? Do you see? If not, hang on.
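A sketch of the Q-and-W setup, with invented numbers: the test consumes only the values, never their meaning, so its verdict is identical no matter what Q and W name.

```python
from scipy.stats import pearsonr

q = [1.1, 2.0, 2.9, 4.2, 5.1, 5.8, 7.0, 8.1]       # mystery series Q
w = [3.0, 5.2, 7.1, 9.9, 12.0, 13.7, 16.2, 18.9]   # mystery series W

r, p = pearsonr(q, w)
print(f"r = {r:.3f}, p = {p:.6f}")
# Rename q to "smoking" and w to "cancer", or to "cheese" and "bedsheets":
# r and p do not change. Whatever decides cause, it is not this arithmetic.
```

The variable names are labels for our benefit; the model never sees them.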
How do we know Vigen’s examples are absurd? They pass every statistical test that would say they are not. Just as the flood, the tsunami, the coronal mass ejection of papers coming out of academia pass every statistical test. There is no difference, in statistical handling, between Vigen’s examples and official certified scientific research. What gives?
Nothing. Nothing gives. It is nothing more than I have been saying: probability models cannot identify cause.
We know Vigen’s examples are absurd because knowledge of cause isn’t statistical. Knowledge of cause, and knowledge of lack of cause, is outside statistics. Knowledge of cause (and its lack) comes from identifying the nature or essence and powers of the elements under consideration. What nature and so on are, I don’t have space to explain here. But you come equipped with a vague notion, which is good enough here. The book Uncertainty goes into this at length.
You know the last paragraph is true because, if presented with the statistical “significance” of Q and W, no statistician or researcher would say there was cause until they knew what Q and W were.
The ability to tell a story about observed correlations is not enough to prove cause. We could easily invent a story about per capita cheese consumption and bedsheet death. We know this correlation isn’t a cause because we know the nature of cheese consumption, and we have some idea of the powers needed to strangle somebody with a sheet, and that the twain never meet. Much more than a story is needed.
Also, if we know Q causes W, or vice versa, or that Q is in W’s causal path, or vice versa, then it doesn’t matter what any statistical test says: Q still causes W, etc.
We’re finally at the paper. From the abstract:
Employing the universe of juvenile court decisions in a U.S. state between 1996 and 2012, we analyze the effects of emotional shocks associated with unexpected outcomes of football games played by a prominent college team in the state…We find that unexpected losses increase disposition (sentence) lengths assigned by judges during the week following the game. Unexpected wins, or losses that were expected to be close contests ex-ante, have no impact. The effects of these emotional shocks are asymmetrically borne by black defendants.
You read it right. Somehow all judges, whether or not they watched or cared about college football games and point-spreads, let the outcomes of those games influence their sentencing, with women and children suffering most. Wait. No. Blacks suffered most. Never mind.
Wee p-values “confirmed” the causative effect or association, the authors claimed.
But it’s asinine. The researchers fed a bunch of weird data into an algorithm, got out wee p-values, and then told a long (57 pages!), complicated story, which convinced them (but not us) that they have found a cause.
What happened here happens everywhere and everywhen. It’s far far far far far past the time to dump classical statistics into the scrap heap of bad philosophy.
I beg the pardon of the alert reader who pointed me to this article. I forgot who sent it to me.