Causality: Guest Post by DAV

Today’s guest post, on a subject dear to us all, is by long-time reader (and now contributor) DAV.

While fencing in many of the Web’s blogs it’s not uncommon to encounter the parry “Correlation is not necessarily Causation.” To some extent this is true. The statement is actually an admonition against inference of causality in two-variable studies.

There are many philosophical meanings associated with “cause.” Some possibly predate Aristotle. I’m going to use it to mean a contributing factor in an event; it may be the only one. If someone pushes a button on a box and a ball drops out then you could say the person caused the ball to drop even though an internal mechanism is the actual “cause.”

If A and B are not correlated, then obviously neither causes the other. It’s only when they are linked that the question of cause arises. There are several possibilities: (1) A causes B, (2) B causes A, (3) they cause each other, (4) something else causes both, and lastly (5) the correlation is pure coincidence. The last does not apply to those who believe everything is linked.

So, how does one go about determining cause when A and B are correlated?

Well, for one, you could consider time. With the possible exception of global warming, no known cause has followed its effect. So if two variables are correlated, with B following A, B is eliminated as a cause for A. That still doesn’t mean A causes B.

The usual answer is: by experiment, which isn’t exactly a precise answer. What is actually meant is the use of data where confounding factors have been eliminated or minimized. An experiment with intervention to avoid the confounding factors is the most convenient way. But there are many times when experimentation is not possible for ethical or practical reasons. For example, it’s pretty hard to experiment in astronomy. It is also considered bad form to induce a disease or to deliberately let someone die. And don’t forget time factor considerations. Most organizations will not fund 50-year experiments.

So, think about it. The determination is made from the experimentally obtained data, not the experiment itself. What’s more, it’s how A, B and the new data inter-correlate. The experiment was just a convenience. If the experimental data were discovered lying about instead, then using them would not produce a different answer. Astronomy and epidemiology take advantage of this.

Enter the world of Judea Pearl. He has a fascinating book, Causality: Models, Reasoning, and Inference. Most of his work is based upon graphical depictions of correlations called directed acyclic graphs (DAG’s). They consist of single or multiple source nodes and one or more child nodes. What’s important is that they have no loops. DAG’s are “directed” with arrows traveling from one node to another. The source node (arrow tail) is called the parent and the destination node (arrow head) is the child. In a causality DAG, the parent is considered to be the “cause” of the child. A DAG is also convenient for encoding marginal probabilities. Pearl’s work (and that of his frequent collaborator G. Rebane) is based upon conditional probabilities.
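To make the vocabulary concrete, here is a minimal sketch of a DAG as a list of (parent, child) arrows, with a check for the "no loops" requirement. The function names and the example graphs are made up for illustration; this is not code from Pearl's book, and the loop check is the standard Kahn's algorithm rather than anything specific to causality.

```python
from collections import defaultdict


def parents_of(edges):
    """Map each child node to the set of its parents (the arrow tails)."""
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
    return dict(parents)


def is_acyclic(edges):
    """Return True if the directed graph has no loops (Kahn's algorithm)."""
    nodes = {node for edge in edges for node in edge}
    indegree = {node: 0 for node in nodes}
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1

    # Repeatedly remove nodes that have no remaining incoming arrows.
    ready = [node for node in nodes if indegree[node] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    # If every node could be removed, there was no loop.
    return removed == len(nodes)


# Hypothetical example graphs (not taken from the book).
common_cause = [("C", "A"), ("C", "B")]        # A <- C -> B
looped = [("A", "B"), ("B", "C"), ("C", "A")]  # A -> B -> C -> A

print(is_acyclic(common_cause), is_acyclic(looped))
print(parents_of(common_cause))
```

The first graph qualifies as a DAG and the second does not, since following its arrows leads back to the starting node.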

Let’s take our old friends A and B. They could be diagrammed A —> B, A <—> B, A <— B, A—B. The last is actually what we started with. Computationally, with only two variables, it’s irrelevant whether A causes B or vice versa. You get the same marginal answers either way.
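The claim that direction is computationally irrelevant with two variables can be checked with a small sketch. The joint probabilities below are invented for illustration. The code stores the same distribution two ways, once as P(A) with P(B | A) and once as P(B) with P(A | B), and confirms that both rebuild the same joint table.

```python
# Hypothetical joint distribution over two binary variables, keyed by (a, b).
joint = {(0, 0): 0.40, (0, 1): 0.10, (1, 0): 0.15, (1, 1): 0.35}


def marginal(table, index):
    """Sum the joint table over the other variable."""
    out = {}
    for outcome, p in table.items():
        out[outcome[index]] = out.get(outcome[index], 0.0) + p
    return out


p_a = marginal(joint, 0)
p_b = marginal(joint, 1)

# Model "A -> B": store P(A) and P(B | A).
p_b_given_a = {(a, b): joint[(a, b)] / p_a[a] for (a, b) in joint}
# Model "B -> A": store P(B) and P(A | B).
p_a_given_b = {(a, b): joint[(a, b)] / p_b[b] for (a, b) in joint}


def joint_from_a_to_b(a, b):
    return p_a[a] * p_b_given_a[(a, b)]


def joint_from_b_to_a(a, b):
    return p_b[b] * p_a_given_b[(a, b)]


# Largest disagreement between the two factorizations over all outcomes.
max_gap = max(
    abs(joint_from_a_to_b(a, b) - joint_from_b_to_a(a, b)) for (a, b) in joint
)
print(max_gap)
```

Any disagreement is at the level of floating-point rounding, which is why the data alone cannot pick an arrow direction in the two-variable case.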

Once you have at least a third variable (C) it is sometimes possible to establish cause. How? One possibility is if there are three mutually correlated variables {A, B, C} and A and B are independent given C, then C is the cause of both A and B. This would be diagrammed as A <— C —> B. Other link directions, though not necessarily all, can be established using conditional independence relationships. Interestingly, if A and B cause each other then there is likely a hidden cause.
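The independence pattern described above can be illustrated with a small sketch. All the numbers are invented, and the joint table is built directly from a common-cause structure A <- C -> B. The code only checks the forward direction: that such a structure leaves A and B correlated overall yet independent once C is known. It is not a structure-recovery algorithm and is not taken from the book.

```python
from itertools import product

# Hypothetical parameters for the structure A <- C -> B (binary variables).
p_c = {0: 0.7, 1: 0.3}           # P(C = c)
p_a1_given_c = {0: 0.2, 1: 0.9}  # P(A = 1 | C = c)
p_b1_given_c = {0: 0.1, 1: 0.8}  # P(B = 1 | C = c)


def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


# Joint table keyed by (a, b, c), built as P(C) * P(A | C) * P(B | C).
joint = {
    (a, b, c): p_c[c] * bern(p_a1_given_c[c], a) * bern(p_b1_given_c[c], b)
    for a, b, c in product((0, 1), repeat=3)
}

NAMES = ("a", "b", "c")


def prob(**fixed):
    """Total probability of all outcomes matching the fixed values."""
    return sum(
        p
        for outcome, p in joint.items()
        if all(outcome[NAMES.index(k)] == v for k, v in fixed.items())
    )


# Overall: P(A=1, B=1) differs from P(A=1) * P(B=1), so A and B are correlated.
marginal_gap = prob(a=1, b=1) - prob(a=1) * prob(b=1)

# Given C: P(A=1, B=1 | C) equals P(A=1 | C) * P(B=1 | C) for each value of C.
conditional_gaps = [
    prob(a=1, b=1, c=c) / prob(c=c)
    - (prob(a=1, c=c) / prob(c=c)) * (prob(b=1, c=c) / prob(c=c))
    for c in (0, 1)
]

print(marginal_gap, conditional_gaps)
```

The first gap is clearly nonzero while the conditional gaps vanish up to rounding, which is the signature the paragraph describes.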

Pearl and Rebane developed one of the first, if not the first, DAG recovery algorithms. Give it the data and it recovers (most of) the DAG structure. The algorithm and an improved successor are outlined in the book. A relatively easy read using little math. A must for the upcoming long winter nights.


  1. Ken

    Statistics or Philosophy?

    Determination of “cause” or “causality” [a given act’s contribution to an outcome] is, fundamentally, at some point a very subjective assessment. What ultimately matters is that all participants/readers understand the relevant factors, including jargon & its meaning, & the relative interrelationships among the contributing & situational factors.

    Parsing causality as presented here overlaps a bit with philosophy. The common case study being what killed a guy that jumped off a bridge. The correct or incorrect answers include: a) the fall, b) the sudden stop on the frozen water….c) the “heartbreak” from getting dumped by his girlfriend…etc.

    All answers are correct, or incorrect, depending on what matters to a given audience. In such a case this will vary: the police will rule it as suicide (death due to jumping), the family to emotional abuse from his ex-tramp flame (broken heart), etc.

    This is where real world situations break from academic parsing — the answer that’s “correct” is determined by the context, if not totally to a very substantial degree. In the case of our jumper, he died as a result of injuries & blood loss sustained by an impact — not jumping, not falling, not from transient depression (“broken heart”). Thus an overly parsed academically correct (forensically correct) analysis very typically fails to provide a useful answer — which focuses on those things that one can control or influence.

    Focusing an answer in the standard case of blunt force trauma & blood loss detracts from the technically incorrect “cause” of the mere act of jumping — the latter which focuses one on implementing means of preventing recurrence (higher fences, for example).

    Such analyses, to be practical, need to guide useful decisions & outcomes. Very often they do not when they could.

    Ultimately, it comes down to communication skill.

  2. Briggs


    It is indeed an excellent book, and very readable. Cheap to pick up a used copy somewhere, so buy one if you can.

  3. DAV

    Briggs, Thanks!


    Good points. I think Pearl and others were more concerned with the DAG. In more complicated ones, the direction of the arrow makes a difference and can sometimes be determined using probabilities (arguing against subjectivity).

    Things rarely happen because of a single event but instead are the end result of a chain of events. Everything along the path from start to finish is a cause if each causes the next. Take your jumper for instance. Suppose all you knew was that he fell from a great height. The probability that an unimpeded impact will kill him doesn’t change if he jumped instead of being pushed. Even so, the direction of cause–>effect flows from jumped or pushed to death. And if you could model something to lead to the jumping or pushing — well so much the better (maybe) 🙂

    But one thing is clear: dying from the impact in no way caused the preceding fall. Pearl, etc. are interested more in the direction of the arrow than the nodes it links.

  4. JH


    I think you probably would find structural equation models (SEM) interesting. Based on what you’ve described, DAGs seem similar to path diagrams used to represent SEM. The software LISREL will allow us to create a path diagram easily, of course, after the causal relations among variables are established.

    Guess what? I learned about SEM/LISREL from a music professor!

    Is this book really a must for the upcoming long winter nights? I trust you that there is little math, but do you mean it will solve my occasional insomnia problem due to winter blues? *_^

  5. George Steiner

    I am just a symple minded engineering type. Understanding the mechanism for a cause is better than arrows and nodes.

  6. DAV


    Tell me you don’t actually burn raw trees in your fireplace.


    Statisticians are an impractical lot. One of the advantages of a DAG is the ability to compute complex useful probabilities such as P(Death | Slapped-by Girlfriend). The why of a cause is then not so important as knowing how it flows. If you’ve read anything else from the Perfesser you would know he thinks answering why is infinitely recursive. Where would you stop?

  7. George Steiner

    Mr. DAV, it is the how that matters not the why.

  8. Well done. Kudos DAV, Pearl, et al. My Xmas present to myself. Thank you and me.

    At some levels the issue of causality is clear and simple. You strike your thumb with the hammer, which causes pain and spurting of blood. No doubt about it.

    At some levels the issue is opaque: the butterfly in China causing a storm in Indiana is an example of remote causation, so remote as to be implausible and even absurd.

    The arrow of time is critical, but so are the links — the dominoes, so to speak. I demand solid dominoes and a direct pathway without side branches. Other people may be more flexible. They would blame the slapping girlfriend. I would not. If she pushed poor Romeo off the bridge, that would be a different matter.

    Causation and justice are thus intertwined.

  9. AusieDan

    This is a very useful post.
    I’ve ordered the book.

  10. JH

    DAV, I don’t actually burn raw trees in my fireplace.

  11. Mike B

    What is in this book that is actually NEW?

    I grow a bit weary of comp sci professors who throw cheap computing cycles at old logic/probability/statistical problems and then get grants and write books, sometimes without really advancing things much at all.

    Data Mining comes to mind, for instance.

  12. Charlie Martin

    Well, you just cost me $50.

    Mike, speaking as one of those computer scientists, I think you’re basically mistaken. For example, Markov systems predate cheap computation; it takes computation to find out the behavior of non-homogenous systems and extremely complex systems.

  13. TomVonk

    What’s important is that they have no loops.

    Hmm then it’s not worth much.
    When something is dynamical (i.e., it changes with time) then in the real world it is full of loops.
    A star begins to collapse because of local space-time curvature, the collapse increases the density which increases the curvature which increases the density ….
    What’s cause of what ?
