Last week of teaching!
Randomness is not a cause. Neither is chance. It is always a mistake to say things like “explainable by chance”, “random change”, “the differences are random”, “unlikely to be due to chance”, “due to chance”, “sampling error”, and so forth. Mutations in biology are said to be “random”; quantum events are called “random”; variables are “random”. An entire theory in statistics is built around the erroneous idea that chance is a cause. This theory has resulted in much grief, as we shall see.
Flip a coin. Many things caused that coin to come up heads or tails: the initial impetus, the strength of the gravitational field, the amount of spin, and so on. If we knew these causes in advance, we could deduce, that is, predict with certainty, the outcome. This isn’t in the least controversial. We know these causes exist; that we might not know them for this flip does not imbue the coin with any magical properties. The state of our mind does not affect the coin in any, say, psychokinetic sense.
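To make the point concrete, here is a minimal sketch of a coin flip treated as pure mechanics. It rests on a deliberately simplified model that is an assumption of this illustration: the coin starts heads up, is launched straight up at speed v_up with a constant spin omega, is caught at the height it left, and air resistance and bouncing are ignored. The names coin_outcome and share_of_heads are hypothetical. Given the causes, the outcome is deduced; a proportion only appears once we admit we know the causes no better than to within a range.

```python
import math

def coin_outcome(v_up, omega, g=9.81):
    """Deduce the face showing when the coin is caught.

    Simplifying assumptions (this sketch's, not the essay's): the coin starts
    heads up, rises at v_up metres per second, spins at omega radians per
    second, and is caught at launch height; no air resistance, no bouncing.
    """
    time_in_air = 2.0 * v_up / g          # time to go up and come back down
    angle_turned = omega * time_in_air    # total rotation in radians
    # Heads is showing whenever the heads side points more up than down.
    return "heads" if math.cos(angle_turned) > 0 else "tails"

def share_of_heads(v_range, omega_range, steps=200):
    """Proportion of heads over an evenly spaced grid of launch conditions.

    The proportion measures our ignorance of which (v_up, omega) applies to a
    given flip; every individual flip on the grid is still fully determined.
    """
    heads = 0
    for i in range(steps):
        v_up = v_range[0] + (v_range[1] - v_range[0]) * i / (steps - 1)
        for j in range(steps):
            omega = omega_range[0] + (omega_range[1] - omega_range[0]) * j / (steps - 1)
            heads += coin_outcome(v_up, omega) == "heads"
    return heads / steps ** 2

# Same causes, same effect, every time.
print(coin_outcome(v_up=2.0, omega=240.0))
# Causes known only to within a range: a proportion stands in for knowledge.
print(share_of_heads(v_range=(1.8, 2.2), omega_range=(200.0, 280.0)))
```

Nothing in the second function adds "chance" to the coin. It only counts outcomes over launch conditions we have declined to pin down.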
Pick up a pencil and let it go in midair. What happened? It fell, because why? Because of gravity, we say, a cause with which we are all familiar. But the earth’s gravity isn’t the only force operating on the pencil; just the predominant one. We don’t consider the pencil falling to be “random” because we know the nature or essence of the cause and deduce the consequences. We need to speak more of what makes a causal versus probabilistic model, but a man standing in the middle of a field flipping a coin is thinking more probabilistically than the man dropping a pencil. Probabilities become substitutes for knowledge of causes; they do not become causes themselves.
The language of statistical “hypothesis testing” (in either its frequentist or Bayesian flavor with posteriors or Bayes factors) is very often used in a causal sense even though this is not the intent of those theories. We must acknowledge that the vast majority of users of models of uncertainty think of them in causal terms, mistakenly attributing causes to various ad hoc hypotheses or to “chance.”
Suppose the user of a model of income has input race into that model, which occurs in two flavors, J and K. The “null” hypothesis will be incorrectly stated as “there is no difference” between the races. We know this is false because if there were no difference between the races, we would not be able to discern the race of any individual. But maybe the user means “no difference in income” between the races. This is also likely false, because any measurement will almost surely show differences: the measured incomes of those of race J will not identically match the measured incomes of those of race K. Likewise, non-trivial functions of the income, like mean or median, between the races will also differ.
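A small illustration of why measured incomes will almost surely differ, using simulated figures and no real data. The lognormal shape, its parameters, the group sizes, and the names simulated_incomes, incomes_j and incomes_k are all assumptions made for the sketch. Both groups are produced by the very same recipe, in which race is not even an input, and yet the means and medians do not match.

```python
import random
import statistics

def simulated_incomes(n, rng):
    # Hypothetical income figures: lognormal with made-up parameters.
    # Race is not an input, so it plays no part in producing any figure.
    return [rng.lognormvariate(10.8, 0.5) for _ in range(n)]

# Seeded generator: every figure below is fully determined by the seed
# and the algorithm, however "random" the numbers may look.
rng = random.Random(2024)
incomes_j = simulated_incomes(500, rng)
incomes_k = simulated_incomes(500, rng)

mean_gap = statistics.mean(incomes_j) - statistics.mean(incomes_k)
median_gap = statistics.median(incomes_j) - statistics.median(incomes_k)

print(f"difference in means:   {mean_gap:,.2f}")
print(f"difference in medians: {median_gap:,.2f}")
```

The gaps printed are real differences in the measurements. What produced each figure is the seed and the generator's arithmetic, which we happen to know here only because we wrote the recipe ourselves.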
If the observed differences are small, in a sense to be explained in a moment, the “null” is said to have “failed to be rejected”; it is never accepted. This curious and baffling language is used because of Popper’s notions of falsifiability, which we have discussed before. For now, all we need know is that small (but actual) differences in income will cause the “null” to be accepted. (Nobody really thinks in terms of failing to reject, despite what the theory says.) When the “null” is accepted it is repeated that there is “no” difference between the races, or that any differences we do see are “due to”, i.e. caused by, chance.
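For readers who have not seen the machinery, here is a rough sketch of one common way the decision is mechanized: a label-shuffling (permutation) comparison of means. This is offered only as an illustration and is not necessarily the sense of “small” the text goes on to explain. The 0.05 threshold is the conventional one and is an assumption here, as are the function name permutation_p_value and the handful of made-up incomes.

```python
import random
import statistics

def permutation_p_value(group_j, group_k, n_shuffles=2000, seed=0):
    """Fraction of label shufflings whose gap in means is at least the observed gap."""
    rng = random.Random(seed)  # seeded, so the shuffles themselves are determined
    observed = abs(statistics.mean(group_j) - statistics.mean(group_k))
    pooled = list(group_j) + list(group_k)
    n_j = len(group_j)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign the J and K labels
        gap = abs(statistics.mean(pooled[:n_j]) - statistics.mean(pooled[n_j:]))
        if gap >= observed:
            hits += 1
    return hits / n_shuffles

# Made-up incomes for illustration only.
incomes_j = [41000, 52000, 47500, 60250, 38900]
incomes_k = [43100, 51200, 49000, 58800, 40100]

p = permutation_p_value(incomes_j, incomes_k)
threshold = 0.05  # conventional cut-off, assumed here
verdict = "reject the null" if p < threshold else "fail to reject the null"
print(p, verdict)
```

Note what the calculation is: counting, and a comparison against a threshold. Nothing in it inspects why any person was paid what he was.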
But chance isn’t a cause. Chance isn’t a thing. There is no chance present in physical objects: it can be neither extracted nor measured. It cannot be created; it cannot be destroyed. It isn’t an entity. The only possible meaning “due to chance” or “caused by chance” could have is magical, where the exact definition is allowed to vary from person to person, depending on their fancy.
Some thing or things caused each person measured to have the income he did. Race could have been one of these causes. An employer might have looked at an employee and said to himself, “This employee is of race K, therefore I shall increase his salary 3%.” Or he might not have said it, but done it anyway, unthinkingly. Race here is a partial cause. This kind of partial cause might have happened to some, none, or all of the people measured. If the researcher is truly interested in this partial cause, then he would be better served to interview whoever it is that assigns salaries and so discover the causes in each case. Assuming nobody lies or misremembers, and that all can bring themselves to proper introspection, this is the only way to assign causes. But that is time-consuming and expensive. And researchers have anyway been falsely taught that if certain statistical thresholds are crossed, causality is present. This fallacy is the cause of the harm spoken of above.
Even if the null is not rejected it is still possible that some or even all of the people measured had salaries in part assigned because of their race. There isn’t any way to tell looking only at the measured incomes and races. If the null is accepted, no person, it is believed, could have had their incomes caused partially by their race. Again, there isn’t any way to tell by looking only at the data. But when the null is accepted, almost all researchers will say that causality due to race is absent and has been replaced, impossibly, by chance. The truth is we have no idea, and can have no idea, looking just at measured race and income, why anybody got the salaries they did.
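A toy example shows why the data alone cannot settle the matter. The salaries below are invented, and the name measured_records is hypothetical; the 3% figure echoes the employer imagined earlier. Two different causal stories are written down, one in which race is a partial cause of every K salary and one in which it is a cause of none, and both yield exactly the same measured records.

```python
def measured_records(employees):
    """What the researcher sees: (race, income) pairs only, in whole dollars."""
    records = []
    for race, base_salary, race_bonus_pct in employees:
        # Integer arithmetic keeps the comparison below exact.
        income = base_salary * (100 + race_bonus_pct) // 100
        records.append((race, income))
    return records

# Story A: the employer raises each K employee's salary 3% because of race.
story_a = [("J", 50000, 0), ("J", 61000, 0), ("K", 50000, 3), ("K", 60000, 3)]
# Story B: race plays no part; the K employees simply had higher base salaries.
story_b = [("J", 50000, 0), ("J", 61000, 0), ("K", 51500, 0), ("K", 61800, 0)]

same_data = measured_records(story_a) == measured_records(story_b)
print(same_data)
```

Any statistic computed from the records, and any test built on it, must give the same answer for both stories, because the records are identical. Only knowledge from outside the data, such as asking whoever set the salaries, can tell the stories apart.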
When the null is accepted, but the researcher had rather not accept it, perhaps because his hypothesis was consonant with his well-being or it was friendly to some pre-conception, he immediately reaches for factors outside the measured data. “Well, I accepted the null, but you have to consider this was a population of new hires.” That may be the case, but since that evidence did not form part of the premises of the model, it is irrelevant if we want to judge the situation based on the output of the model. I have much more to say on this when discussing models. It is anyway obvious that, to his credit, the researcher is looking for causes. Even if he gets them wrong, that is always the goal.