On Cause In Probability Models

Note: This post originally ran 11 June 2019 under the title “When Cause Is Not A Cause — Another Anti-P-Value Argument”. The post itself is extracted and modified from this set of papers, which should be read in full. I’m reposting it because interest in cause in probability models is ramping up, which is good news.

It cannot be denied that cause can be known, and that some causal X can be put into uncertainty models. Indeed, when this happens, which it does especially in those models which are frequently tested against reality, it is discovered that testing (and p-values) “verifies” these causes. Yet they were known, or suspected, before testing. And indeed must have been, as we shall see. How cause is known is discussed presently. We must first understand that cause is not simple. We remind the reader of a simple example. Consider an experiment which measures exposure or not to some dread thing and whether or not people developed some malady. Hypothesis testing might “link” the exposure to the disease; and, indeed, everybody would act as if they believed the exposure has caused the malady. But the exact opposite can be true.

It will almost surely be the case that not everybody in the exposed group will have developed the malady; and it will also be that some people in the not-exposed group will have the malady. It thus cannot be that the people in the not-exposed group had their disease caused by the exposure, for of course they were not exposed. It then necessarily follows that their malady was caused by something other than the exposure. This, then, is definitive proof that at least one more cause than the supposed cause of the exposure exists. There is no uncertainty in this judgment.

We still do not know if the exposure was a cause. It could be that every person in the exposed group had their disease caused by whatever caused the disease in the not-exposed group—or there could even be other causes that did not affect anybody in the not-exposed group but that, somehow, caused disease in the exposed group. It could be that exposure caused some disease, but there is no way to tell, without outside assumptions, how many more maladies (besides the known other cause(s)) were caused by the exposure.
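A toy illustration may help here. The sketch below builds two made-up “worlds” whose hidden causal stories differ but whose observable counts are identical. Everything in it is an assumption for illustration: the group sizes, the counts, and the names make_world and observed_table are hypothetical and do not come from the papers.

```python
from collections import Counter

def make_world(exposed_by_exposure, exposed_by_other, unexposed_by_other,
               n_per_group=100):
    """Build (exposed, diseased, hidden_cause) records for a hypothetical world.

    hidden_cause is something only we, the builders of the toy, know.
    The analyst never sees it.
    """
    records = []
    records += [(True, True, "exposure")] * exposed_by_exposure
    records += [(True, True, "other")] * exposed_by_other
    healthy_exposed = n_per_group - exposed_by_exposure - exposed_by_other
    records += [(True, False, None)] * healthy_exposed
    records += [(False, True, "other")] * unexposed_by_other
    records += [(False, False, None)] * (n_per_group - unexposed_by_other)
    return records

def observed_table(records):
    """Count only what can be observed: exposure status and disease status."""
    return Counter((exposed, diseased) for exposed, diseased, _ in records)

# World A: the exposure causes nothing; the other cause does all the work.
world_a = make_world(exposed_by_exposure=0, exposed_by_other=30,
                     unexposed_by_other=10)
# World B: the exposure causes 20 cases; the other cause adds 10 per group.
world_b = make_world(exposed_by_exposure=20, exposed_by_other=10,
                     unexposed_by_other=10)

table_a = observed_table(world_a)
table_b = observed_table(world_b)

# Same 2x2 table, so any test or p-value computed from it is the same too.
print(table_a == table_b)
```

Any procedure that sees only the table must return the same answer in both worlds, yet in one the exposure caused nothing and in the other it caused twenty cases. The data alone cannot tell these stories apart.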

It’s worse still for those who hold that uncertainty models can discover cause. For how do we explain those people in either group who did not develop the disease? Even if exposure causes disease sometimes, and the other (unknown-but-not-exposure) cause which we know exists only causes disease sometimes, we still do not know why the exposure or non-exposure causes disease only sometimes. Why did these people develop the malady and these not? We don’t know. We can “link” various correlations as “cause blockers” or “mitigators”, but we’re right back where we started from. We don’t know, from the data alone, what is a cause and what is not, and what blocks these (at least) two causes in some people but not in others.

Cause therefore, like probability, is known conditionally. In our most rigorous attempts at discovering cause, we design an experiment to demonstrate a causal connection or effect: X is believed to cause a change in Y. Every known or believed cause or moderator of Y or of X is controlled, to the best of the experimenter’s ability. Then X is deployed or varied and the change in Y is measured as precisely as possible. If after X changes we see Y change, or we see a change of a certain magnitude and the like, we say X is indeed a cause of Y (in some form).

But this judgment supposes we have correctly identified every possible cause or moderator of Y or of X. Since in science we deal with the contingent, Y will be contingent. This means that even in spite of our certainty that X is a or is the cause of Y, there always exists the possibility (however remote) that something unknown was responsible for the observations, or that something blocked or prevented X from using its powers to change Y.

This possibility is theoretical, not necessarily practical. In practice we always limit the number of possible causes X to a finite set. If we design an experiment to deduce if X = “gravity” caused the Y = “pencil to drop”, we naturally assume, given our background knowledge about gravity and its effects on things like pencils, that the pencil will drop because of gravity. Yet if the pencil does drop, it remains a theoretical possibility that something else, perhaps a mysterious pencil-pulling ray activated by pranksters, caused the pencil drop and not gravity. There is no way to prove this is not so. Yet we implicitly (and rightly!) condition our judgment on the non-existence of this ray and on any other freakish cause. It is that (proper) conditioning which is key to our understanding of cause.

This discussion might seem irrelevant, but it is not. It in fact contains the seed of the proof that algorithms automated to discover cause of observables must contain in advance at least the true cause of Y. And so it must be that cause, or rather knowledge of cause, is something that can only be had in the mind. No algorithm can determine what was a cause or was not, unless that algorithm was first “told” which are causes and which not. This is proved shortly (in the paper linked at the beginning).

The point of this exercise is to dissuade researchers from preaching with too much vigor and too much certainty about their results, especially in those instances where a researcher claims a model has backed up his claims. All models only do what they were told; that a model fits data is not independent verification of the model’s truth.

A true cause X of Y, given identical conditions and where identical is used in its strictest sense, will always lead to perfect correlations (not necessarily linear, of course) of an observable. An algorithm can certainly note this perfect correlation, and it can be “told” to say things like “If a perfect correlation is seen at least so-many times, a cause exists.”

But perfect correlations are not always indicative of cause. First, samples, even though thought large, can be small, and the correlation spurious. Second, the direction of causality can be backwards, where it’s not X causing Y, but Y causing X. Third, and of even greater importance, removed measures might be causing the observations: e.g. W is causing V and Z which are in turn causing Y and X, which are all we measure. These kinds of remote-cause scenarios multiply. In the last, if W is not measured, or V or Z are not, then the algorithm has no way to identify cause. If they are measured and in the algorithm, it must be because the cause was already suspected or known.
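The remote-cause scenario can be made concrete with a small deterministic sketch. The equations linking W, V, Z, X and Y below are invented purely for illustration, as are the function names; the only point is that X and Y correlate perfectly although neither touches the other.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid any dependencies."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def generate(w_values, forced_x=None):
    """Hypothetical mechanism: W drives V and Z; V drives Y; Z drives X.

    If forced_x is given, we override X by hand (an intervention) and
    leave everything else in the mechanism alone.
    """
    xs, ys = [], []
    for i, w in enumerate(w_values):
        v = w + 1          # W causes V
        z = 2 * w          # W causes Z
        y = 3 * v          # V causes Y
        x = z - 5 if forced_x is None else forced_x[i]  # Z causes X
        xs.append(x)
        ys.append(y)
    return xs, ys

w_values = list(range(1, 21))  # W is never shown to the "algorithm"

x_obs, y_obs = generate(w_values)
r_observed = pearson(x_obs, y_obs)

# Intervene: scramble X by reversing it, then see whether Y responds.
x_forced = list(reversed(x_obs))
x_new, y_new = generate(w_values, forced_x=x_forced)

print(round(r_observed, 6))   # perfect correlation between X and Y
print(y_new == y_obs)         # Y did not move when X was changed
```

An algorithm given only the X and Y columns sees a perfect correlation and, if so instructed, announces a cause. Only somebody who already knows about W, V and Z, or who already suspects enough to run the intervention, can say otherwise.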


  1. Joy

    P values are as useful as ‘ small pox. ‘
    There is no such thing as statistical significance.
    Random means unknown
    Cause is not in the data.
    statistics can’t prove cause
    There is no need to speak of hypothesis testing
    and lots more.
    Thee rayne in spayne staiys mainly intha’ playne.
    The reign in speign stairs merynly in the pleryne.

  2. Joy

    I think I mean the null hypothesis.
    but I’ve never had one.

  3. Per

    I guess being charitable, we could wish just a small pox on wee pee values.

  4. Let those with eyes, see.
    Let those with ears, hear.
    Let those with minds, understand.

  5. Sheri

    Now we need “on proof with probability models”, which really is not possible but don’t tell Al Gore.

    Do not send this to a global warming advocate. That’s just mean!

  6. A point I see is that using models, even with their very well-known imperfections, is much more defensible than ‘eyeballing’ or ‘just look’ types of “analyses” (which are often subjective, and often don’t consider more than sample statistics, which change trial to trial).

