A hastily written entry today, folks. Busy day. Typos at no extra charge.
Reader Al Perrella invites us to Jerry Pournelle’s Chaos Manor, where a note from our friend Mike Flynn appears (get his books here) on correlation versus causation. A subject not unfamiliar to us (see this, this, and this among others).
Mike has an excellent example of correlation:
Columbia river salmon runs go up and down in roughly eleven year cycles. So do sunspots on the sun. Do sunspots cause salmon? Do salmon cause sunspots? Is there a lurking Z that makes salmon eager to spawn AND causes the sun to boil?
Any statistical analysis, Bayesian or frequentist, would call this data “significant”. A plot would show a coincidence of lines, one for the salmon, one for sunspots. Maybe somebody would figure that correlation was maximum when the peak of sunspots occurred two years before salmon numbers did, or whatever. P-values would be proffered. Theories would be created. Papers published.
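To make the "two years before" business concrete, here is a minimal sketch of the lag hunt. The numbers are invented, not real salmon or sunspot counts: two noise-free 11-year sine waves, the second delayed by an assumed 2 years. The helper names (`pearson`, `lagged_correlation`) are my own, not anything from Mike's note.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up, noise-free series (an assumption for illustration only):
# both have an 11-year period, and the "salmon" curve is simply the
# "sunspot" curve delayed by 2 years.
YEARS = 66
PERIOD = 11
TRUE_LAG = 2
sunspots = [math.sin(2 * math.pi * t / PERIOD) for t in range(YEARS)]
salmon = [math.sin(2 * math.pi * (t - TRUE_LAG) / PERIOD) for t in range(YEARS)]

def lagged_correlation(lead, follow, lag):
    """Correlate lead[t] with follow[t + lag] over the overlapping years."""
    return pearson(lead[:len(lead) - lag], follow[lag:])

# Try lags of 0 to 5 years and keep the one with the largest correlation.
correlations = {lag: lagged_correlation(sunspots, salmon, lag) for lag in range(6)}
best_lag = max(correlations, key=correlations.get)

for lag, r in correlations.items():
    print(f"lag {lag}: r = {r:.3f}")
print("lag with maximum correlation:", best_lag)
```

The search finds the lag I built in, and that is all it finds. Nothing in the output says whether the first series drives the second.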
But is it true that the quantity of salmon flesh influences sunspot number?
Well, the wee-ness of the p-value says nothing about the direction of the causation. Besides, if the data were ordered such that we thought sunspots caused salmon flesh, the statistical conclusions would be much the same, or even identical (depending on which classical test one used).
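A quick sketch of that symmetry, using simulated numbers (seeded random draws, not real data) and a hypothetical helper called `summary`. It computes the correlation, the simple-regression slope, and the usual t statistic for that slope, first with one ordering and then with the roles swapped.

```python
import math
import random

def summary(xs, ys):
    """Return (r, slope of ys regressed on xs, t statistic for that slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    # In simple linear regression the slope's t statistic depends only on r and n.
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return r, slope, t

# Simulated data (an assumption for illustration): b is built from a plus noise.
rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(50)]
b = [0.6 * v + rng.gauss(0, 1) for v in a]

r_ab, slope_ab, t_ab = summary(a, b)  # treat a as the "cause"
r_ba, slope_ba, t_ba = summary(b, a)  # swap the roles

print(f"a -> b: r = {r_ab:.4f}, slope = {slope_ab:.4f}, t = {t_ab:.4f}")
print(f"b -> a: r = {r_ba:.4f}, slope = {slope_ba:.4f}, t = {t_ba:.4f}")
```

The slopes generally differ, but r and t come out the same either way, so the p-value from this particular test cannot tell the two orderings apart.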
Point is, the ordering of Xs and Ys, as it were, must come from evidence outside the data. There is nothing in salmon numbers nor sunspots to tell us which is the cause and which the effect. But that’s okay, because in any statistical analysis there is no internal evidence which says, “Consider only this list of causes.” This is an inescapable limitation of probability.
Same thing in logic. I’ve used this example many times, but it’s still apt. If we’re interested in the proposition (or conclusion), “George wears a hat,” then we must have premises (evidence) which supply information about it. It is up to us and not logic to provide these premises (evidence).
If our (compound) premise is “All Martians wear a hat & George is a Martian” then we know, conditional on this evidence, that the probability George wears a hat is 1. George’s being a Martian causes him to wear a hat. There is a correlation between Martians and hat wearing. It is not a coincidence George wears a hat.
But if I, or you, change the premise to “Some Martians wear a hat & George is a Martian” then it becomes uncertain—unquantifiably so; see this—whether George wears a hat. The point being it is we who supply the evidence.
In regression, with some Y and list of Xs, it is we who say “Stop!” when delineating all the possible correlates of Y. It is we who compile the Xs. Did we get just the right ones? Well, sometimes, rarely, we know: say, in physics. But usually we do not know—where I use the word “know” to indicate certainty, a truth, and not a suspicion.
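Here is a small simulation of what happens when the list of Xs stops too soon. Everything in it is invented for illustration: a lurking `z` drives both `x` and `y`, which otherwise have nothing to do with each other. The partial correlation formula is the standard one for a single control variable; the variable names are mine.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after linearly adjusting both for zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Simulated data (an assumption): z is the lurking variable, and x and y
# each equal z plus their own independent noise.
rng = random.Random(1)
N = 2000
z = [rng.gauss(0, 1) for _ in range(N)]
x = [v + rng.gauss(0, 1) for v in z]
y = [v + rng.gauss(0, 1) for v in z]

r_without_z = pearson(x, y)               # z left off the list
r_given_z = partial_correlation(x, y, z)  # z included

print(f"correlation of x and y, ignoring z:  {r_without_z:.3f}")
print(f"correlation of x and y, adjusting z: {r_given_z:.3f}")
```

I knew to include `z` only because I wrote the simulation. With real data, the x and y columns look the same whether or not a z exists, which is why the list has to come from outside.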
The reason we sometimes know in physics is that we have external evidence, a theory, which we accept as true (and which might not be, of course). If the theory is true (given we accept it as true, as we accepted either Martian premise) then we know just which Xs to collect.
Problem with data like salmon and sunspots is we have no theory we’re willing to accept as true which provides us with the precise list of Xs and Ys we can use to infer causation. We’re probably willing to accept that salmon flesh does not cause sunspots, but understand that this has not been proved true; it has merely been assumed true. That is the point. All premises are assumed true (there is a deeper sense in which some premises we say are “just true”, but that’s rare and a subject for another day, when I discuss miracles).
So we instead accept it might be true that sunspots cause salmon flesh. Or, as Mike suggested, we instead assume sunspots set some causal train in motion, the end result being a change in salmon flesh. Either set of premises we accept as true as we present our analysis. The data, however, say nothing about which, if either, set of premises is true. Think about it: if the premises were known to be true from the data, then we’d have a circular argument. All we can infer from “George wears a hat” is the premise “George wears a hat”, which is true if we accept George wears a hat. Circularity.
The point Mike made, and which I hope I echoed properly, is fundamental. It is the difference between considering the salmon-sunspot signal a coincidence or part of a causal chain. It is a coincidence only if we assume there is no causal connection. We have not proved the lack of one by calling events coincidental.