AI Is Kicking Statistics’s Ass

Here’s the headline: “AI can predict when someone will die with unsettling accuracy: This isn’t the first time experts have harnessed AI’s predictive power for healthcare.”

Unsettling accuracy? Is accuracy unsettling? Has AI progressed so far that all you have to do is step on an AI scale and the AI computer spits out an unsettlingly accurate AI prediction of the AI end of your AI life? AI AI AI AI AI? AI!

I’ve said it many times, but the marketing firm hired by computer scientists has more than earned its money. Science fiction in its heyday had nothing on these guys. Neural nets! Why, those are universal approximators! Genetic algorithms! Genes in the machine. Machine learning! Deep learning! Like, that’s deep, man. Artificial intelligence! Better than the real thing!

What has statistics got? Statistically significant? No, that’s dead. Thank God. Uniformly most powerful test? Unbiased estimator? Auto-regressive? Dull isn’t in it. You won’t buy any headlines talking about mu-hat.

What’s the difference between statistics and AI? Besides the overblown hype, that is? One thing: a focus on the results. That’s the reason AI is landing every punch, and why statistics is reeling. Statistical models focus on fictional non-existent hidden unobservable parameters, whereas AI tries to get the model to make good predictions of reality.

Now AI is nothing but statistical modeling appended with a bunch of if-then statements. Only this, and nothing more. Computers do not know as we know; they do not grasp universals or understand cause. They don’t even know what inputs to ask for to predict the outputs of interest. We have to tell them. Just as we do in statistics.

The reason AI models beat statistical ones is because AI models are tuned to making good predictions, whereas statistical models are usually tuned to things like wee p-values or parameter estimates. Ladies and gentlemen, parameters are of no interest to man or beast. The focus on them has forced, in a way, a linearity culture, whereas if we can’t write down the model in pleasing parameterized form, we’re not interested. Besides, we need that form to do the limit math of statistics of estimators of these parameters so that we can get p-values, which do not mean what anybody thinks they do.

AI scoffs at parameters and says, how can I create a mathematical function, however complex, of these input measures so that it makes skillful, but not over-fit, predictions of the output measure?

That, and its understanding, or its attempts at understanding, cause. We’ve discussed many times, and it’s still true, that you can’t get cause from a probability model. Cause is in the mind, not the data. We need to be part of the modeling process. And so on. AI, though it’s at the beginning of all this, tries to get this right. I’ll have a paper tomorrow on this. Stay tuned!

I say AI will never make it. Computers, being machines, aren’t intellects; they are not rational creatures like we are. Intellect is needed to extract universals from individual cases, and computers can never do that—unless we have first programmed them with the answer, of course.

That is to say, strong AI is not possible. Others disagree. To them I say, don’t wait up.

We can’t discount the over-blownness of the comparison. Reporters love AI, and nearly all cherish the brain-as-computer metaphor, so we’re apt to see intellect where it is not. Plus hype sells. Who knew?

It’s not all hype, of course. AI is better, in general, at making predictions. But headlines like the one above are ridiculous.

When all the number crunching was done, the deep-learning algorithm delivered the most accurate predictions, correctly identifying 76 percent of subjects who died during the study period. By comparison, the random forest model correctly predicted about 64 percent of premature deaths, while the Cox model identified only about 44 percent.

These are not unsettling rates. The “deep learning” is AI, the “random forest” is “machine learning” (if you like, a technique invented by a statistician), and “Cox model” is regression, more or less. I didn’t look at the details of how the regression picked its variables, but if it’s anything like “stepwise”, the method was doomed.
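It helps to see what a figure like 76 percent does and does not say. "Correctly identifying 76 percent of subjects who died" is a sensitivity: of those who died, the share the model flagged. It is silent about how many of the living were flagged too. Here is a minimal Python sketch. The counts are made up for illustration and are not from the study, and the helper names are mine.

```python
def sensitivity(true_pos, false_neg):
    # Of the people who actually died, what fraction did the model flag?
    return true_pos / (true_pos + false_neg)

def precision(true_pos, false_pos):
    # Of the people the model flagged, what fraction actually died?
    return true_pos / (true_pos + false_pos)

# Hypothetical counts: 1,000 deaths, of which the model flags 760.
true_pos, false_neg = 760, 240

# Two hypothetical models with the same sensitivity but very
# different numbers of false alarms among the survivors.
few_false_alarms = 240
many_false_alarms = 7600

sens = sensitivity(true_pos, false_neg)
prec_few = precision(true_pos, few_false_alarms)
prec_many = precision(true_pos, many_false_alarms)

print(f"sensitivity (both models): {sens:.2f}")
print(f"precision, few false alarms:  {prec_few:.2f}")
print(f"precision, many false alarms: {prec_many:.2f}")
```

Both hypothetical models "identify 76 percent" of deaths. In the second, only about 9 percent of the people flagged actually died. The headline number alone cannot tell the two apart.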

We always have to be suspicious about the nature of the predictions, too. These should be on observations never before seen or used in any way. They should not be part of a “validation set”, because everybody cheats and tweaks their models to do well on the validation set, which, as should be clear, turns the validation set into an extension of the training set.
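The point about validation sets can be sketched in a few lines of Python. This is a toy under stated assumptions, not anybody's actual method. The labels are coin flips unrelated to the input, so no model can truly beat 50 percent. The "models" are arbitrary lookup tables. The "tweaking" is keeping whichever table happens to score best on the validation set. All the names here are hypothetical.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def make_data(n):
    # Feature is an integer 0..19; the label is a coin flip unrelated
    # to it, so honest accuracy can only hover around 50 percent.
    return [(rng.randrange(20), rng.randrange(2)) for _ in range(n)]

def accuracy(table, data):
    # A "model" is a lookup table: feature value -> predicted label.
    return sum(table[x] == y for x, y in data) / len(data)

validation = make_data(100)
test = make_data(1000)  # locked away; looked at once, at the very end

# "Tweaking": try 200 arbitrary models and keep the one that looks
# best on the validation set.
candidates = [[rng.randrange(2) for _ in range(20)] for _ in range(200)]
best = max(candidates, key=lambda table: accuracy(table, validation))

val_score = accuracy(best, validation)
test_score = accuracy(best, test)
print(f"validation accuracy of chosen model: {val_score:.2f}")
print(f"test accuracy of the same model:     {test_score:.2f}")
```

The chosen model looks skillful on the validation set only because it was selected for doing well there. On observations it has never touched it falls back to about a coin flip. The more tweaking, the wider the gap.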


  1. A choice of focus – process or results. In politics, conservatives claim to focus on results, but really focus on process. Liberals claim to focus on process, but really focus on results.

  2. Sheri

    “Science fiction in its heyday had nothing on these guys.” It has nothing on these guys now. There is little to no science and a whole lot of science fiction sold as science now. Science itself is in grave danger of mass extinction. Forget the “sixth great extinction”. Science is going the way of the dodo bird, due to the dodos running the science and the media marketing of the fiction.

    I’m skeptical on AI. First we’d need REAL intelligence and that’s in such short supply……

    As for predictions, I’ve yet to see any kind of “accuracy” more than a point or two above guessing. I doubt this “study” is any better and you’d need about 2000 more such studies before I’d even consider that AI was doing anything more than the gypsy fortune teller. Actually, observant people are often MORE accurate than an AI. Science just hates observant people…..

  3. DAV

    Neural nets are not AI and they are not merely curve fitters either. Stop thinking that way. The term AI has been usurped by marketing departments to mean “done by computer” — nothing to do with intelligence. NN’s can do things not easily accomplished by other methods such as those in statistics and even machine learning (like random trees).

    Some NN’s can learn features in data and really excel when the data are photos. The Chinese (not the only ones) have a project where they identify objects in surveillance camera views. Doing that is really hard (if not impossible) without using a NN.

    NN’s can also read handwriting. My bank can read handwritten deposit slips with a scanner. Another not-so-simple task. There is a database of handwritten digit samples (MNIST) that is popularly used for teaching in university NN classes. Some researchers have claimed “near-human performance” on this database.

    Autoencoders (a type of NN) can be used to automate the search for data features. They actually learn a way to efficiently compress the data. The following is a bit dated but is the basis for the R autoencoder package:

    The algorithms used by Google and Facebook to determine content type are likely NN’s. There are calls for releasing them so they can be examined for bias. Good luck trying to reverse engineer them.

    And yes, NN are trained for performance. A typical training set has training data, validation data (a small set used to detect under- and over-fitting), and test data (used to evaluate performance). No training is done with the test set by competent practitioners.

  4. Hoyos

    Personal off the wall thought? AI is the PERFECT vehicle for creating CS Lewis’ materialist magician. It will never get consciousness on its own but it can become complex enough to be manipulated by another.

  5. Gary

    … the marketing firm hired by computer scientists has more than earned its money. Science fiction in its heyday had nothing on these guys.

    Well, scifi in its heyday didn’t have the audience available today. Heinlein’s sentient computer, “Mike,” in his 1967 Hugo award-eligible-AND-winning novel, The Moon is a Harsh Mistress, was compelling as a real intelligence, but how many actually read the book? Later (1980-90s), sentient android Commander Data (with the oddly appealing personality), of Star Trek the Next Generation, approached human-like intelligence, but even the whole ST franchise then wasn’t as large as the general public being fed AI “stories” in every media outlet today. Marketing AI is EZ when you write mile-wide and inch deep. Good scifi inverts the variables so it attracts fewer enthusiasts.

  6. Joy

    “As of August 2018, the best performance of a single convolutional neural network trained on MNIST training data using realtime data augmentation is 0.26 percent error rate.[16] Also, the Parallel Computing Centre (Khmelnitskiy, Ukraine) obtained an ensemble of only 5 convolutional neural networks which performs on MNIST at 0.21 percent error rate.[17][18] Incorrect labelling of the testing dataset may prevent reaching test error rates of 0%.”

    Inputs included human errors, which affected success rate. The system wouldn’t be arbiter of itself. Unless shown how to check, it couldn’t innovate such a checking system, as it would not have such a thing as a motive for truth seeking (a will, call it heart), which must include hope/belief in finding it. That is separate from judgement, which might be just computation.
    It seems possible to write some form of default hope that would be called something else, like an assumption.

    The network cannot correct its error, ie know or recognise truth as yet. It’s not in the design. Nobody objects to biomimicry in arts or in architecture, in other aspects of engineering. It’s the projection about what will be claimed and that’s in the class of a fear. Fear sells.
    The human search for knowledge is contracted out to a neural network or a set of scales. They will give outputs. Where’s the problem?
    Scientific innovation entails the hope of discovering something useful. What’s with all the doubt?

    “…parameters are of no interest to man or beast. The focus on them has forced, in a way, a linearity culture, whereas if we can’t write down the model in pleasing parameterised form, we’re not interested…“
    Same kind of problem. Necessarily narrowing down the problem and framework to calculate some complex system’s operation. Then deciding whether the answer is correct! If it’s people, it’s just better to ask. Sometimes they tell the truth. Sometimes they tell you what they want you to ‘know’ or what they think you want to ‘know’. Clouds never tell lies and yet people can’t work those out.

    On recognising these dots:
    20 in one cell, in one language, I counted. Once out of one cell (character space), the combinations start to multiply. Once the rules of recognition include more than one kind of language, even more.
    Once you know what is around those dots the recognition increases.

    Add movement or a third dimension and the possibilities increase. Without a knowledge of purpose, recognition on its own is only so much guessing or projecting. The network needs both to be mind like, or conscious.

  7. Brad Tittle

    I had an interview with a company that was using photo robots to track growth of flora in organized streams. One of the floras was medicinal herbs. They were using Machine Learning to try and help the robot (um… database) learn to predict how much herb was present at harvest. Once again, I got myself in trouble asking what I thought were softball questions.

    “How well does the ML process do compared to reality?”

    “Well, they don’t tell us how much is actually getting harvested!”

    I didn’t get hired. (Probably because of a different question, not this one).

    But I happen to know someone who grows medicinal herbs. This person managed to grow enough before “legalization” to keep himself well fed. I asked him “How much herb do you get from your plants”. He was able to reply very quickly. “1 lb of dried herb for every 4ft X 4ft area”.

    or 1 oz / sq ft (16 oz over 16 sq ft).

    Now. We have to recognize that his 4×4 area is heavily managed by him. Every square foot of that plant is optimized for herbal growth. A mechanized farm may not wish to spend as much energy optimizing plant area. If you have just a 4×4 ft area, you use every inch. If you can expand into the next room without worry, you expand into the next room and use electricity to overcome your decreased growth per square foot.

    There is a hint of an AI feedback blowback here.

    Herbal product margins are such that ‘recording’ exactly how much product is coming out may not be optimal to the fluid nature of the bottom line.

    Sometimes the fluidity of the bottom line is not subject to perfect analysis. Sometimes making things flow is a gram. Sometimes it is $. Can we make an AI that can read which way the wind blows on this?

  8. Alex

    The last paragraph is key. I tried making this point incessantly in grad school to no avail.
