I’ve never seen the show, but I’ve heard that the protagonist of The X-Files kept a desk sign that read, “I want to believe.” That sentiment characterizes most buyers and users of predictive or explanatory algorithms, i.e., statistics.
Really, no scientific promise is so large that it will not be at least hoped for. No matter how many failures and unrealized dreams pile up, the newest thing always excites. We saw something of this Tuesday with a new algorithm that claimed to be able to forecast the stock market two years in advance.
Experience suggests that this purported marvel, or any new gee-whiz algorithm, will liquefy the rationality centers of those people in charge of securing predictions (financial services, governments, marketers, academics, etc.). The authors of that paper now have a window of opportunity to cash in on their method.
Meanwhile, grumpy naysayers warning against enthusiasm will meet with a lesser fate.
A tale. I was dealing with a potential client who believed they had discovered a means of increasing their predictive accuracy to astonishing heights. R-squareds (a poor measure; don’t use them) northward of 95% were seen in tests! I was to automate the process.
It turned out they were smoothing two time series which originally had no relationship to one another, and then correlating the smoothed series, which suddenly showed a remarkable correlation! Regular readers will know this is a huge no-no. Smoothing artificially boosts correlation and apparent predictive accuracy. I tried showing the client how this works, proving it with several examples built from made-up data that looked like theirs.
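The effect is easy to demonstrate for yourself. Here is a minimal sketch in Python (the series length, smoothing window, and replication count are my illustrative choices, not anything from the client’s actual setup): generate pairs of noise series that are independent by construction, smooth each with a moving average, and compare the absolute correlations before and after.

```python
import numpy as np

rng = np.random.default_rng(42)

def moving_average(x, w):
    # Running mean with window w; this is the "smoothing".
    return np.convolve(x, np.ones(w) / w, mode="valid")

def abs_corr(a, b):
    return abs(np.corrcoef(a, b)[0, 1])

n, window, reps = 200, 20, 500
raw_corrs, smooth_corrs = [], []
for _ in range(reps):
    x = rng.standard_normal(n)
    y = rng.standard_normal(n)  # independent of x by construction
    raw_corrs.append(abs_corr(x, y))
    smooth_corrs.append(abs_corr(moving_average(x, window),
                                 moving_average(y, window)))

print(f"mean |corr|, raw series:      {np.mean(raw_corrs):.3f}")
print(f"mean |corr|, smoothed series: {np.mean(smooth_corrs):.3f}")
```

The smoothed series typically show several times the correlation of the raw ones, even though nothing relates them. Smoothing makes neighboring points share the same windowed average, inducing strong autocorrelation; the sampling distribution of the correlation between two autocorrelated series is much wider, so spuriously large correlations become routine.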
Funny thing is that I convinced a business type I was right, but he was overruled by their algorithm person. This algorithm person was basing his excitement on a sterling certified peer-reviewed paper by an academic from an institution of world renown. Regular readers know all about peer-reviewed papers from credentialed academics.
In what is now the theme of my career, I didn’t get the job.
Another anecdote, necessarily vague because I cannot betray any confidences. This didn’t happen to me, but to somebody I know. Major company wanted to understand how “social media”, specifically Twitter data, predicted their income. My colleague correctly noted how noisy this data is.
My colleague thus warned that, while something might be learned, whatever it would be wouldn’t be earthshaking. Certainly not much money should be spent on the idea. This advice was rejected and the major company sought bids from algorithm firms which could take on the job. One was found. I cannot tell you how much money was asked for or given, but if I did you would faint dead away. I can only say that whoever runs this algorithm company could easily find a position in government.
I offered to do the job for an order of magnitude less. My bid was rejected, but then I, like my colleague, had cautioned that not much would come of the analysis.
The sequel? You already know what happened so there’s no point going into it.
If you want to set up business as a data scientist (the newfangled term statisticians are beginning to call themselves), the lesson is this: promise the moon and charge like you’re actually going there. Failure is rarely punished and never remembered.
Uncertainty is just as tough a sell in science. The way to do statistics (or machine learning, or AI, or whatever) properly, like I’m always saying, is to use whatever model you have to predict new, never-before-seen data. If your model works, you’ll make good predictions. If not, not. The problem is that this method is necessarily less certain than the old ways of doing things.
Doing it the right way makes it look like you know a hell of a lot less. Fireworks are rare. Everybody hates this. Science is supposed to make us more, not less, certain!
It also does no good to prove that if you get uncertainty right, then even though you will be less sure of yourself, you will, and must, make better decisions, and that better decisions mean greater rewards. The allure of certainty is too strong. People want easy answers and can’t abide the fogginess which attends uncertainty.
I have only given you two anecdotes, but they can be multiplied indefinitely (especially in academia). It’s thus rational to believe that nothing will ever change and that people will continue to be over-certain.
Update Breaking news! I have just developed a zero-point energy super computerized big data artificial intelligent learning prediculator. Investors should use my contact page and send me money.