How Naughty Researchers Learn To Imply Cause, Though Unproved, Has Been Discovered

Reading daily doses of linked to increases in manliness.

Forwarding posts from associated with general all-around magnificence.

Supporting or donating to increases world benevolence levels.

Now these are all instances of causal language: linked to, associated with, increases. We have seen these, and others like them, used endless times in peer-reviewed papers making the most preposterous and idiotic claims.

And we have warned you against such language, by proving the many ways that cause cannot be had from probability models, especially the fast-and-loose probability models often used in medicine.

Time and again you have heard our joke: “Every scientist knows correlation is not causation, unless his P is wee, then his correlation becomes causation.” As far as jokes go, it does not rate high on the hilarity scale. But what it says is true.

Finally we have some counting confirmation in the peer-reviewed paper “Causal and Associational Language in Observational Health Research: A systematic evaluation” by Noah Haber and a shotgun load of others in the American Journal of Epidemiology.

This team did the tedious, mind-numbing, but useful work of tallying abuses of causal language in medical papers.

Before we get to it, a small reminder that in statistics “significant” means “having a wee P”, and where “having a wee P” means “significant”: the statistical word “significant” has no connection with the English word of the same spelling and pronunciation. Our authors found that “significant(ly)” was the most used modifier word for results.
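For anybody who wants to see how little a wee P needs in the way of cause, here is a minimal sketch, not anything from the paper. It runs a batch of pretend "studies" in which the two variables are independent by construction, computes a permutation p-value for their correlation, and reports the share that come out "significant". All names and settings (`share_of_wee_ps`, `n_studies`, `n_obs`, the 0.05 cutoff) are my own assumptions for illustration.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def permutation_p(xs, ys, rng, n_perm=200):
    """Two-sided permutation p-value: the share of shuffles of ys whose
    |r| is at least as large as the observed |r| (with a +1 correction)."""
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def share_of_wee_ps(n_studies=200, n_obs=30, alpha=0.05, seed=1):
    """Run many 'studies' on pure noise and return the share with p < alpha."""
    rng = random.Random(seed)
    wee = 0
    for _ in range(n_studies):
        xs = [rng.gauss(0, 1) for _ in range(n_obs)]
        # ys is drawn independently of xs: there is no cause here to find.
        ys = [rng.gauss(0, 1) for _ in range(n_obs)]
        if permutation_p(xs, ys, rng) < alpha:
            wee += 1
    return wee / n_studies


if __name__ == "__main__":
    print(share_of_wee_ps())
```

The share should land somewhere in the neighborhood of `alpha`, give or take simulation noise. Every one of those "significant" results is a correlation with no cause behind it.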

Abstract opening:

We estimated the degree to which language used in the high profile medical/public health/epidemiology literature implied causality using language linking exposures to outcomes and action recommendations; examined disconnects between language and recommendations; identified the most common linking phrases; and estimated how strongly linking phrases imply causality.

They then classified and weighted words by perceived causal association, which is surely error prone to some extent, but not by much, because it’s hard to argue with these words being causal. For instance, we see there was great agreement among raters in saying “consistent” and “correlated” had low causal-language strength, and that words like “cause” (of course) and “prevent” had high causal-language strength.

Their list, I think, most would agree with. Here’s the count:

I would have guessed “link(ed)” would be higher on the list, but “associate(d)” is just as weaselly. Say “death is associated with exposure” and you seem to have described a cause: exposure caused death. But, later, when it is shown that exposure and death were yet another spurious correlation, the researcher can say “We only said ‘associated.’ We didn’t claim cause.”

Unless the exposure is to something sexy, like third-hand smoke or MAGAism or “whiteness” or whatever, and the “research” generates a headline in some media source. Then the author, glad of being interviewed, will lather on causal words like, well, like a reporter showering praise on a celebrity.
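The sort of tally described above, counting linking words and weighting them by how strongly they imply cause, can be sketched in a few lines. This is a crude illustration and not the authors' procedure: the stems, the suffix matching, and especially the strength scores in `CAUSAL_STRENGTH` are made up for the example. Only the rough ordering (low for “consistent” and “correlated”, high for “prevent” and “cause”) follows what the raters reportedly agreed on.

```python
import re
from collections import Counter

# Hypothetical scores on an invented 0-to-1 scale, for illustration only.
# These are NOT the ratings reported in the paper.
CAUSAL_STRENGTH = {
    "consist": 0.1,
    "correlat": 0.2,
    "associat": 0.4,
    "link": 0.5,
    "increas": 0.7,
    "prevent": 0.9,
    "caus": 1.0,
}


def tally_linking_words(text):
    """Count words that begin with each stem (crude: any suffix is accepted)."""
    counts = Counter()
    for stem in CAUSAL_STRENGTH:
        # \b keeps "because" from being counted under the stem "caus".
        pattern = r"\b" + re.escape(stem) + r"\w*"
        counts[stem] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


def mean_causal_strength(counts):
    """Average hypothetical strength over all linking words found."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(CAUSAL_STRENGTH[stem] * n for stem, n in counts.items()) / total


if __name__ == "__main__":
    sample = "Death is associated with exposure. Exposure was linked to death."
    found = tally_linking_words(sample)
    print(found)
    print(mean_causal_strength(found))
```

A real tally needs human raters and context, since the same word can carry more or less causal weight depending on the sentence around it. The sketch only shows the bookkeeping.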

Papers surveyed based on experiments used causal language more often than papers based on observations. This, of course, is not crazy, because the point of experiments is, or should be, to control at least some known causes. It’s rare, though, as regular readers know, for a paper to have sufficiently proved cause.

In case you thought it was just me out on the lawn yelling at clouds, a clip from their Conclusion (my paragraphifications):

Our results suggest that “Schrödinger’s causal inference,” – where studies avoid stating (or even explicitly deny) an interest in estimating causal effects yet are otherwise embedded with causal intent, inference, implications, and recommendations – is common in the observational health literature.

While the relative paucity of explicit action recommendations might be seen as appropriate caution, it also invites causal inference since there are often no useful and/or obvious alternative (non-causal) interpretations. To our surprise, we found that the RCTs in our sample used similar linking words to the non-RCTs.

Our review suggests that the degree of causal interpretation for common linking words has been impacted by the unavailability of explicitly causal language, such that the meaning of traditionally non-causal words has broadened to include potentially stronger causal interpretations. It is likely that the rhetorical standard of “just say association” has meant that many researchers no longer fully believe that the word “association” just means association.

I hadn’t heard the lovely term “Schrödinger’s causal inference” before. It comes from a 2021 note by Peter W G Tennant and Eleanor J Murray titled “The Quest for Timely Insights into COVID-19 Should not Come at the Cost of Scientific Rigor” in Epidemiology.

“Schrödinger’s inferences”, as above, are “where the authors caution against causal interpretations while themselves offering causal interpretations”.

Terrific. I’m going to use that.

5 replies »

  1. IIRC, causal relationships require two things, a theoretical first-principles explanation of how A causes B and a controlled experiment (of the “fire laser at chemical” type, not the “answer this questionnaire for class credit” type), and are conditioned on the limits of the extant theories and the experimental conditions.

    That is then put in context of the literature (i.e. “we need to cite everyone who might referee the paper, cite the paper, or be in hiring, tenure, or promotion committees for the authors”) and supplemented with bombastic conclusions for visibility and never-to-be-fulfilled “future work” promises.

    Then a journalist ignores all the details and writes engagement-generating hype that has 50-50 chance of being related to the actual measurements in the paper.

  2. I’m intrigued by the word count methodology employed to assess the bombast or veracity of research. What we need are statistical tables to weight each word for emotional linguistic appeal. I see years of research by statisticians and philologists to tease out the various attributes of each word. Years of conferences, lectures, and an endless stream of government funding, which eventually will result in a ChatGPT-like word-weighting program that will put them all out of business. All interactions with humans and AI will be reduced to two terms: OK and bullshit.

  3. “Associated” is associated with successful grant applications. Only associopaths who successfully associate the grantor’s pet cause with impactful headlines will associate with funding. Associates failing to associate will have their associations assassinated.

  4. You can force people to use certain words or phrases on a large scale. But you can’t force them to understand what those terms mean.

    Many statisticians will go off on people for saying “my p was not wee, therefore there was no causal relationship” because what they think you SHOULD be saying is “my p was not wee, therefore I will fail to reject the null hypothesis.” And yet many of those same statisticians in that scenario will say that since “the null hypothesis failed to be rejected” that the variable was merely “random”, or that there was “no association” relating to it or that there was “no meaningful effect” seen. That is, they will talk about things as though there was no causal relationship and the “null hypothesis had been proven” even if they fastidiously avoid those exact phrases. They have been trained not to utter cursed words, but have not been trained to think.

    Similarly, there are many skilled statisticians who can give the correct definition of a p-value, confidence interval, etc. when pressed. So they will be able to say things like “the p value measures the probability that our test statistic equals or exceeds the observed value (in absolute values) given that data was generated from a given probability distribution.” But they will then immediately act as though the calculation did not start from an assumption that the data was randomly generated and instead used only the observed data. They have been trained to regurgitate the definition, but not to understand what it means.

    Same thing here. Those in the field have been trained to avoid being too confident in saying that they have found a definite causal relationship, since they have a vague idea that statistics is ill suited for proving this. So they talk about “associations,” things being “linked,” and so on, all while putting in the ritual statements that tell the reader not to assume that a causal relationship has been proven. But those are just empty ritual words, so after that they are free to do what they wanted to do in the first place, i.e. say that their pet theory has been proven and you just have to deal with it. Hence Schrödinger’s Inferences.

  5. I’ve performed a small study and found that death is associated with life. The correlation is very high, therefore I conclude that life should be ended to avoid death.
