The Hierarchy Of Models: From Causal (Best) To Statistical (Worst)


There is a hierarchy of models, ranked by how much insight they offer into the thing modeled. The order of importance is: causal, deterministic, probabilistic, statistical. Most models use mixtures of these elements.

All models have this form: a set of premises, which may include any number of facts, truths, suppositions, data, and so forth, and a proposition of interest, which is the thing being modeled conditional on those premises.

A classic (or perhaps better, as you’ll agree, classical) causal model is “Socrates is mortal” given “All men are mortal and Socrates is a man.” The model predicts Socrates will die because of the nature of all men. It is man’s nature to die, and Socrates (and you, dear reader) are among the race of men. We know all men are mortal from the necessarily limited sample of observations of past men, and from the induction of these dead men to the entire race.

Causal models give insight into or make use of the nature, the universal essence, of the thing of interest. Causal models require universals; they also require induction because we know the validity of all universals, natures, essences, through a type of induction.

Deterministic models are common in mathematics and are usually stated in a form such that the proposition of interest is a function of premises, like this: y = f(x). The “x” is a placeholder for any number of premises. An example of a functional form of a deterministic model is y = a + bx^3, which shows there are three explicit premises, the “a”, “b”, and “x”, and one implicit, which is the form or arrangement of these premises. This equation might give the numerical level of some thing as a function of a, b, and x. It says, “Given a, b, and x, y will certainly be at a + bx^3.” The equation determines y, but doesn’t explain the essence of the cause.

Some causal models may be put in equation form, but not all deterministic models are also causal. The equation given applies to a black box with two readouts, a “y” and “x”, and a dial is discovered to change the “x”. The formula is induced based on rotating the dial and noting the values of y and x. Only in the weakest sense can we say we have discovered the essence of the machine: we don’t even know what the values imply. Interestingly (and obviously to mathematical readers), more than one equation can be found to fit the same data (premises), which is also proof we have not learned the nature of the machine.
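The point that more than one equation fits the same readings can be made concrete with a small sketch. Everything in it is hypothetical: the dial settings, the values of a, b, and c, and the reading of the post's equation as a plus b times x cubed. The second model adds a term that vanishes at every observed dial setting, so the two agree on all the data yet disagree as soon as the dial is turned somewhere new.

```python
from math import prod

# Hypothetical dial settings at which the black box was observed.
observed_x = [0.0, 1.0, 2.0, 3.0]


def model_one(x, a=1.0, b=2.0):
    """Deterministic model: y = a + b * x**3 (a and b are made-up values)."""
    return a + b * x**3


def model_two(x, a=1.0, b=2.0, c=5.0):
    """Same as model_one plus a term that is zero at every observed setting."""
    return model_one(x, a, b) + c * prod(x - xi for xi in observed_x)


# Both models reproduce every observed reading exactly...
agree_on_data = all(model_one(x) == model_two(x) for x in observed_x)

# ...but they part ways at a setting nobody has tried yet.
new_setting = 4.0
gap = model_two(new_setting) - model_one(new_setting)

print("agree on observed data:", agree_on_data)
print("difference at new setting:", gap)
```

Any value of c gives yet another equation that fits the same readings, which is one way to see that fitting alone does not reveal the nature of the machine.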

Probabilistic models abound. Given “This is a two-state object and only one state of s1 or s2 may show at any time”, the probability “The object is in state s1” is 1/2. Note carefully that no such real object need exist; and neither must real objects exist for causal or deterministic models, as should be obvious.
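A minimal sketch of this deduction, assuming the premises say only that exactly one of the listed states shows and nothing favors either one. The function name and the equal weighting are assumptions made for illustration. No data and no real object enter anywhere; the number follows from the premises alone.

```python
from fractions import Fraction


def probability(state, possible_states):
    """Probability that `state` shows, given only that exactly one of
    `possible_states` shows and nothing distinguishes them (assumed)."""
    if state not in possible_states:
        return Fraction(0)
    return Fraction(1, len(possible_states))


p = probability("s1", ("s1", "s2"))
print(p)
```

Change the premises, for instance by listing a third possible state, and the probability changes with them.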

There isn’t any understanding of the essence or nature of this object in this probability model: we don’t know the workings. If we did, we’d have a deterministic or causal model. The probability is thus only a measure of our state of knowledge of the truth of the proposition and not of the essence of the object. Probability models are silent on cause.

The last and least are statistical models. These are always ad hoc and either conflate probability with decision or mistake probability for essence. Statistical models are a prominent cause of the vast amount of over-certainty which plagues science.

Statistical models purport to say that x causes y, or that x is “linked to” y, through the mechanism of hypothesis testing, via frequentist p-values or Bayesian Bayes factors, but though x may really be a cause of y, or x really may be linked to y in some essential way, the statistical judgment that these conditions are so is always a fallacy.

Hypothesis testing conflates decision with probability; nothing in any hypothesis test gives the desired probability “Given x, what is the probability of y”; instead, testing says, based on ad hoc criteria, that x and y are mysteriously related (“linked”) or that x causes y. These inferences are never valid. The importance of this logical truth cannot be overstated. This is why so many statistical models report false results. (A reminder that a logical argument can be invalid but still have a true conclusion; the conclusion is just true for other reasons than the stated argument.)

Lastly, statistical models purport to report “effect size”, which is a measure of the importance of x on y. This “effect size” is always either false or an assertion given far too much confidence (I use this word in its plain-English sense). Effect sizes say something about a premise inside x (a parameter or parameters) and not x itself, hence they are always over-certain. This form of over-certainty is eliminated by moving to a probability model.
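One way to see the gap between certainty about a parameter and certainty about the observable is the sketch below. It assumes a simple normal model and uses invented measurements, neither of which comes from the post. It compares the usual standard error of the mean (uncertainty in a parameter) with the scale of the prediction for a new observation (uncertainty in the thing itself).

```python
import math
import statistics

# Hypothetical measurements, invented purely for illustration.
y = [9.1, 10.4, 11.2, 8.7, 10.9, 9.8, 10.1, 11.5, 9.4, 10.6]
n = len(y)
s = statistics.stdev(y)  # sample standard deviation

# Uncertainty in the parameter (the mean): shrinks as n grows.
parameter_se = s / math.sqrt(n)

# Scale of the prediction for one new observation under the assumed
# normal model: never shrinks below s, however large n becomes.
predictive_scale = s * math.sqrt(1 + 1 / n)

# The ratio of the two works out to sqrt(n + 1).
ratio = predictive_scale / parameter_se

print("parameter standard error:", round(parameter_se, 3))
print("predictive scale:", round(predictive_scale, 3))
print("ratio:", round(ratio, 3))
```

Reporting only the first number invites far more confidence about future observations than the second number supports, and the gap widens as the sample grows.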

More about this topic in the must-read get-it-now don’t-do-another-analysis-without-reading-it Uncertainty: The Soul of Modeling, Probability & Statistics.


  1. Sheri

    Very politically incorrect: “you are among the race of men”. I think Star Trek tried to address this once…..Anyway, boldly go where no one dares go any more and call us all “men”! 🙂

    “Interestingly (and obviously to mathematical readers), more than one equation can be found to fit the same data (premises), which is also proof we have not learned the nature of the machine.” This is not well understood by most of society. (Maybe because there are so few mathematically literate people.)

    Waiting for delivery of your book, but will read as soon as possible! It is interesting that you are a statistician but are so dissatisfied with statistics. I understand in a way—everything you are saying is what I learned about statistics 40 years ago. It all seems to have changed now, taken over by the media and politics, allowed to flourish because so few seem to understand math and science and what changes have occurred.

  2. K.Kilty

    I had a conversation with people from the college of education about the dearth of women in engineering–actually it should have been about dearth of women in selected fields of engineering, like mechanical, because there seem to be plenty in civil.

    The conversation got ugly. My opponents said that the reason was a “chilly climate” for women. I asked for some proof. They responded with statistics. Then I asked how the “chilly climate” works–what is the mechanism. This caused the argument to escalate. I was too dense to understand, and was proof of the problem.

    The problem seems to be that folks who are not mathematically inclined have no capacity to separate measures from models of causes–since both often involve symbols, inscrutable operations, and graphs, they must be the same.

  3. Joy

    If women can do the work they will get the job. It’s always been the same.
    Engineers don’t get paid too well because they want to do the job. Just as air hostesses want to do the work so they will work for less.
    One girl from our school went to work for Rolls Royce in mechanical engineering. She was blind and had worse sight than me.
    The other example I know of is a famous name in camera technology.
    The team designing the essential workings of the world-beating fastest video camera in the world were men. The mechanical engineer is a woman. There is banter all day long and it doesn’t bother her. Nor should it. She does a good job. Good enough for NASA, Top Gear, TV slow motion advertising clips, military ballistic testing but not good enough to pass US patent lies, I mean laws.
    If a woman can do the job she gets the job. This situation has obtained for many years alone. No need to talk of glass ceilings or need for quotas. If a group of men don’t want to work with a woman then if she’s got any clue she won’t want to work with them either. Same goes for disability.

  4. Oldavid

    As usual I just don’t see what all the fuss is about.

    Just what have “glass ceilings” or statistics to do with what the nature of men and women have to do with what they naturally do best?

  5. Mactoul.

    “We know all men are mortal from the necessarily limited sample of observations of past men, and from the induction of these dead men to the entire race.”

    And what is the nature of this “induction” that is yielding certainty from a finite sample?

  6. Alf

    This post is more useful than an entire statistics course.

  7. Another fantastic piece. I really appreciate the enlightenment– and your obeisance to the perpetual need to tie statistics to actual knowledge. So many are learning “stats” without ever learning the logic and reasoning to use them effectively.

    I’d love to get your book, but at $56 for an e-book, that’s just a bit steep for me.
