Class - Applied Statistics

Lovely Example of Statistics Gone Bad

The graph above (biggified version here) was touted by Simon Kuestenmacher (who posts many beautiful maps). He said “This plot shows the objects that were found to be ‘the most distant object’ by astronomers over time. With ever improving tools we can peak further and further into space.”

The original is from Reddit’s r/dataisbeautiful, a forum where I am happy to say many of the commenters noticed the graph’s many flaws.

Don’t click over and don’t read below. Take a minute to first stare at the pic and see if you can see its problems.

Don’t cheat…

Try it first…

Problem #1: The Deadly Sin of Reification! The mortal sin of statistics. The blue line did not happen. The gray envelope did not happen. What happened were those teeny tiny, too-small black dots, dots which fade into obscurity next to the majesty of the statistical model. Reality is lost, reality is replaced. The model becomes realer than reality.

You cannot help but be drawn to the continuous sweet blue line, with its guardian gray helpers, and think to yourself “What smooth progress!” The black dots become a distraction, an impediment, even. They soon disappear.

Problem #1 leads to Rule #1: If you want to show what happened, show what happened. The model did not happen. Reality happened. Show reality. Don’t show the model.

It’s not that models should never be examined. Of course they should. We want good model fits over past data. But since good model fits over past data are trivial to obtain—they are even more readily available than student excuses for missing homework—showing your audience the model fit when you want to show them what happened misses the point.
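For a concrete (and entirely invented) illustration of Rule #1, here is a minimal Python/matplotlib sketch: the observed dots get visual priority, and the model, if drawn at all, is relegated to a faint background line. The numbers and the log-linear “model” are hypothetical stand-ins, not the data from the graph above.

```python
# A sketch of Rule #1 with invented data: reality (the dots) drawn front and
# center, the model (if shown at all) kept faint in the background.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
years = np.arange(1800, 2020, 10)
# Hypothetical "most distant object" records, in parsecs, growing roughly exponentially.
dist = 10 ** (3 + 0.03 * (years - 1800) + rng.normal(0, 0.3, years.size))

fig, ax = plt.subplots()
ax.semilogy(years, dist, "ko", markersize=6, label="what happened")
coef = np.polyfit(years, np.log10(dist), 1)                # a stand-in smoother
ax.semilogy(years, 10 ** np.polyval(coef, years),
            color="0.8", linewidth=1, zorder=0, label="a model (background only)")
ax.set_xlabel("Year")
ax.set_ylabel("Distance of furthest object (parsecs)")
ax.legend()
plt.show()
```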

Of course, it’s well to separately show model fit when you want to honestly admit to model flaws. That leads to—

Problem #2: Probability Leakage! What’s the y-axis? “Distance of furthest object (parsecs).” Now I ask you: can the distance of the furthest object in parsecs be less than 0? No, sir, it cannot. But do both the blue line and gray guardian drop well below 0? Yes, sir, they do. And does that imply the impossible happened? Yes: yes, it does.

The model has given real and substantial probability to events which could not have happened. The model is a bust, a tremendous failure. The model stinks and should be tossed.

Probability leakage is when a model gives positive probability to events we know are impossible. It is more common than you think. Much more common. Why? Because people choose the parametric over the predictive, when they should choose predictive over parametric. They show the plus-or-minus uncertainty in some who-cares model parameters and do not show, or even calculate, the uncertainty in the actual observable.
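Here is a minimal numerical sketch of leakage, with hypothetical year/distance numbers and a plain straight-line fit with normal errors standing in for whatever smoother was actually used: the fitted model hands a very real probability to distances below zero, which cannot happen.

```python
# A sketch of probability leakage with invented numbers: a straight-line fit
# with normal errors (standing in for the smoother in the graph) gives real
# probability to physically impossible negative distances.
import numpy as np
from scipy import stats

# Hypothetical records of the most distant known object, in parsecs.
years = np.array([1800., 1850., 1900., 1950., 1980., 2000., 2016.])
dist = np.array([1e3, 5e3, 1e5, 1e7, 1e9, 5e9, 9e9])

slope, intercept = np.polyfit(years, dist, 1)           # naive fit on the raw scale
resid = dist - (slope * years + intercept)
sigma = np.sqrt((resid ** 2).sum() / (len(years) - 2))  # residual standard deviation

# Probability the fitted model assigns to a negative distance in 1850.
mu_1850 = slope * 1850 + intercept
p_leak = stats.norm.cdf(0.0, loc=mu_1850, scale=sigma)
print(f"P(distance < 0 in 1850) = {p_leak:.2f}")         # nowhere near zero: leakage
```

With these made-up numbers the fitted mean itself dips below zero in the 1800s, which is roughly what the blue line and its gray guardians do in the original graph.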

I suspect that parametric choice is what happened here, too. The gray guardians are, I think, the uncertainty in the parameter of the model, perhaps some sort of smoother or spline fit. They do not show the uncertainty in the actual distance. I suspect this because the gray guardian shrinks to near nothing at the end of the graph. But, of course, there must still be some healthy uncertainty in the most distant objects astronomers will find.
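To make the parametric-versus-predictive distinction concrete, here is a small sketch using hypothetical data and the textbook normal-theory interval formulas (a stand-in, not a full predictive treatment): the interval for the slope parameter looks reassuringly tight, while the interval for the next value of the observable, the thing anyone actually cares about, is far wider.

```python
# A sketch with invented data: a confidence interval for the slope parameter
# versus a prediction interval for the next value of the observable, using the
# textbook normal-theory formulas.
import numpy as np
from scipy import stats

x = np.array([1900., 1920., 1940., 1960., 1980., 2000.])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.8])              # some observable of interest

n = len(x)
slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)
s = np.sqrt((resid ** 2).sum() / (n - 2))                  # residual standard error
sxx = ((x - x.mean()) ** 2).sum()
tcrit = stats.t.ppf(0.975, df=n - 2)

# Parametric: uncertainty in the slope, a thing no one will ever observe.
ci_slope = (slope - tcrit * s / np.sqrt(sxx), slope + tcrit * s / np.sqrt(sxx))

# Predictive: uncertainty in the next value of the observable at x0 = 2020.
x0 = 2020.0
mu0 = slope * x0 + intercept
se_pred = s * np.sqrt(1 + 1 / n + (x0 - x.mean()) ** 2 / sxx)
pi_obs = (mu0 - tcrit * se_pred, mu0 + tcrit * se_pred)

print("95% interval for the slope parameter:", ci_slope)
print("95% interval for the 2020 observable:", pi_obs)
```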

Parametric uncertainty, and indeed even parameter estimation, are largely of no value to man or beast. Problem #2 leads to Rule #2: You made a model to talk about uncertainty in some observable, so talk about uncertainty in the observable and not about some unobservable non-existent parameters inside your ad hoc model. That leads to—

Problem #3: We don’t know what will happen! The whole purpose of the model should have been to quantify uncertainty in the future. By (say) the year 2020, what is the most likely distance for the furthest object? And what uncertainty is there in that guess? We have no idea from this graph.

We should, too. Because every statistical model has an implicit predictive sense. It’s just that most people are so used to handling models in their past-fit parametric sense that they always forget the reason they created the model in the first place. And that was because they were interested in the now-forgotten observable.

Problem #3 leads to Rule #3: Always show predictions for observables never seen before (in any way). If that were done here, the gray guardians would take on an entirely different role. They would be “more vertical”—up-and-down bands centered on dots in future years. There is no uncertainty in the year, only in the value of the most distant object. And we’d imagine that that uncertainty would grow as the year does. We also know that the low point of this uncertainty can never fall below the already known most distant object.
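A rough sketch of what such forward-looking bands might look like, again with invented records: fit on the log of the distance so no forecast can go negative, push the predictive interval out to future years so it widens, and clip its lower edge at the record already in hand.

```python
# A sketch of Rule #3 with invented records: predictive bands for future years
# that widen with time, never go negative (fit on log10), and never fall below
# the most distant object already found.
import numpy as np
from scipy import stats

years = np.array([1900., 1930., 1960., 1980., 2000., 2016.])
log_dist = np.log10([1e5, 1e6, 1e8, 1e9, 5e9, 9e9])        # hypothetical records, parsecs

n = len(years)
slope, intercept = np.polyfit(years, log_dist, 1)
resid = log_dist - (slope * years + intercept)
s = np.sqrt((resid ** 2).sum() / (n - 2))
sxx = ((years - years.mean()) ** 2).sum()
tcrit = stats.t.ppf(0.975, df=n - 2)
record = 10 ** log_dist.max()                               # the record already in hand

for yr in (2020., 2030., 2040.):
    mu = slope * yr + intercept
    se = s * np.sqrt(1 + 1 / n + (yr - years.mean()) ** 2 / sxx)
    lo = max(record, 10 ** (mu - tcrit * se))               # cannot fall below the record
    hi = 10 ** (mu + tcrit * se)
    print(f"{yr:.0f}: plausible new record between {lo:.2e} and {hi:.2e} parsecs")
```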

Conclusion: the graph is a dismal failure. But its failures are very, very, very common. See Uncertainty: The Soul of Modeling, Probability & Statistics for more of this type of analysis, including instruction on how to do it right.

Homework: Find examples of time series graphs that commit at least one of these errors. Post a link to it below so that others can see.


  1. And the biggest sin of all, the one which Briggs will never admit as a statistician: that uncertainty can ever be quantified by numbers with any credibility at all that was not apparent from just looking at or knowing the data. Quantifying something which is not known is nonsense. It takes a statistician to indulge such nonsense. Just writing an expression in mathematical form and calling it a model does not make the thing more true or more scientific. Words and knowledge of experts, in technical fields in particular, suffice.

    Only cleansing letters and using numbers does nothing but change the language of discourse. Which is often expedient.

  2. This may be the Homework assignment, but, when I saw the graphs, I thought, “What lovely Hockey Sticks”.

    Once you gave the “assignment” to examine the graphs from

    I simply looked at the data; I totally ignored the blue line and the grey envelope. (Didn’t have a clue as to what they might mean.)

    So I guess I proved your point.

    By the way, I love how the grey envelope is entirely below the 0 line between 1830 and 1930. (We also have a negative slope from the get go, so apparently people thought they saw something which they later didn’t.)

  3. Nice post.

    However, even after ignoring the blue line and the gray band, one can still conclude that the tremendous increase happened in the last 50 years. The trend is there and cannot be denied. No uncertainty there (now we have the equipment they would kill for in 1600). If we remove the flawed model, the raw data still show a huge increase, but not gradual (some more familiarity with Y-axis units wouldn’t hurt).

    But I think that there is another problem in the graph. Bait and switch type, if you will, and it has to do with the X-axis. The ‘observed’, until recently, meant optical, so the hockey stick shape is really showing that once radio telescope technology took off, so did the maximum observed distance. Basically, there is a hidden variable that wasn’t mentioned. That’s what I find as the biggest sin in the graph.
    Again, just looking at the raw data, it is undeniable that there is a tremendous amount of progress in the most distant observable objects. Model or no model. What this graph really shows is the accelerated progress in instrument precision.
    Thanks.

  4. Kalif,

    Exactly, and you point to even more reasons to dismiss the “model”:

    “The raw data still show a huge increase, but not gradual”

    How far DO we see right now, and how close are we to seeing the theoretical “edge” or “beginning” of the Universe? Unless our understanding of the Universe changes, the data will “flatline” and the “model” itself will “flatline,” proving it’s dead already. It has no predictive capability whatsoever.

  5. As Kalif notes, the meaning of the word “observed” changes dramatically from one side of the graph to the other, so that concatenating the two data sets as if they are one is itself misleading. Will the James Webb telescope that Lee mentions generate another discontinuity?

  6. Joy,
    I too am rather puzzled by the notion of quantifying uncertainty. What does that mean?
    I tend to think that probability is based, ultimately, on frequencies, i.e., the frequentist interpretation is more correct than others. For if we try to work out the notion of quantifying uncertainty, it would rely on the expectation of occurrences of particular events (such as drawing a 6 in a dice-throw) when such events are repeated many times.
