A Partial Solution For The Replication Crisis In Economics

I was invited by friends at Banking University, in Ho Chi Minh City, to write a paper with the title of today’s post. That paper is here at this link, and free.

As regular readers know, I loathe, with a passion, academic publishing, and I wouldn’t have done this were it not an earnest request from friends, and had the paper not been free for anybody to read.

I had a fight with the publisher. I wrote a fine Abstract, but they insisted it conform to a rigid, and nauseating, script. It had to have some eight sections, but I talked them down to four: Design/methodology/approach, Purpose, Findings, Originality/value. The last one is particularly asinine. It forces authors to lie and boast. No man can be his own judge, but they force you to be. I got away with saying about my paper’s value: “The author argues that this, or any solution, will never eliminate all errors.”

Worst was that the publisher had a separate site for publishing the paper from the one for submitting it. Which of course didn’t work, and took several back-and-forths with “support” (all in India, of course).

And on this site was a huge picture of the back of a female, her arms outstretched toward the setting sun over a distant sea, with large letters BEGIN YOUR PUBLISHING JOURNEY.

If my laptop weren’t chained down from the last incident, I would have thrown it in the driveway and run over it in the truck so that I couldn’t even accidentally see that again.

But this was for friends, and I promised. So I gutted it out. And now you have this fine work before you.

Knowing that most won’t make the extra click, some highlights of the Introduction follow. But, really, just click over. It’s all HTML and can be read by anybody (with very little math training).

Or download the pdf.


There is in economics, as there is in many other fields, a reproducibility crisis. Papers with results once thought sound, important, and even true are turning out to be unsound, unimportant, and even false, or at least not nearly as sure as thought.

Let us first briefly demonstrate the existence of the problem, which is by now fairly well known; review proffered solutions; and then recommend our own partial solution, which begins with a review of our understanding of the philosophy of models, recognizing that models are the lifeblood of economics – and of all of science. We end with a proposal to modify certain model efforts.

The replication crisis was recognized after several large efforts were made to reproduce well-known research. The efforts largely failed. Again, this is not just in one area, but anywhere models are used.

For example, Camerer et al. (2018) attempted to replicate 21 papers in the social sciences, most considered important, published in what many say are the best journals, Nature and Science.

In their replications, they used “sample sizes on average about five times higher than in the original studies.” This means they did better than the originals, which had on average much smaller sample sizes. Even so, they only found “a significant effect in the same direction as the original study for 13 (62%) studies, and the effect size of the replications is on average about 50% of the original effect size.”

In other words, only about half the papers were found to replicate, and only at about half the original effect sizes. This is a poor showing. It would seem astonishing, except that, as it turned out, it was in no way remarkable.

Take the situation in psychology. Klein et al. (2018) did the same as Camerer for 28 prominent psychology papers. This was a big effort. The replications “comprised 15,305 participants from 36 countries and territories.”

Results: “Using the conventional criterion of statistical significance (p < .05), we found that 15 (54%) of the replications provided evidence of a statistically significant effect in the same direction as the original finding… Seven (25%) of the replications yielded effect sizes larger than the original ones, and 21 (75%) yielded effect sizes smaller than the original ones.”

Again, only about half the papers replicated, with effect sizes mostly smaller than the original. As before, the papers chosen were considered to be important.

In medicine, Ioannidis examined forty-nine papers, all considered stellar efforts: each paper examined had garnered at least 1,000 citations. Of the attempts at replication, “7 (16%) were contradicted by subsequent studies, 7 others (16%) had found effects that were stronger than those of subsequent studies, 20 (44%) were replicated, and 11 (24%) remained largely unchallenged.” The story is the same as before. Only a quarter of the papers were left untouched.

Richard Horton, editor of The Lancet, in 2015 announced that half of science is wrong. He said: “The case against science is straightforward: much of the scientific literature, perhaps half, may simply be untrue. Afflicted by studies with small sample sizes, tiny effects, invalid exploratory analyses, and flagrant conflicts of interest, together with an obsession for pursuing fashionable trends of dubious importance, science has taken a turn towards darkness.”

Open MKT tracks replication studies in marketing. As of March 2023, only “5 out of 43 (11.6%) attempted replications in marketing were unambiguously successful.” They also note that “The Journal of Consumer Research (Consumer Behavior’s top journal) now stands at a 3.4% replication rate. So far, 29 replications have been attempted and one has succeeded.” Just 1 out of 29 is dismal.

By “unambiguously successful” they mean “a significant result that matched the form of the original (same direction, same shape of the interaction effect, etc.). It also indicates that no obvious confounds appeared in the protocol when replicated.”

Our concern here is economics. Alas, the outlook is just as bleak for this field. Camerer and others discovered that only just over half of eighteen famous experiments replicated. These were, they claimed, the best papers under examination, and not the greater mass of ordinary research. They “found a significant effect in the same direction as in the original study for 11 replications (61%); on average, the replicated effect size is 66% of the original. The replicability rate varies between 67% and 78% for four additional replicability indicators, including a prediction market measure of peer beliefs.”
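As an aside, for those who like to check sums: the rates quoted throughout are nothing more than proportions of successful replications over attempted ones. A quick sketch (the labels are shorthand for the studies cited above, and the papers themselves round to whole percentages):

```python
# Replication rates cited above, each a plain proportion of
# successful replications over attempted replications.
cited = {
    "Camerer et al., social science": (13, 21),  # Nature/Science papers
    "Klein et al., psychology":       (15, 28),
    "Open MKT, marketing":            (5, 43),
    "J. Consumer Research":           (1, 29),
    "Camerer et al., economics":      (11, 18),
}

for label, (hits, tries) in cited.items():
    rate = 100 * hits / tries
    print(f"{label}: {hits}/{tries} = {rate:.1f}%")
```

Running it recovers the figures in the text: 62%, 54%, 11.6%, 3.4%, and 61%, to within rounding.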

It is easy to go on in this vein; the literature is by now quite large, and we do not pretend to have covered more than a small portion of it. Yet this is not necessary, because the theme is clear. There is an enormous problem with research. Error, falsities taken as scientific truths, and over-certainty abound. The questions are: why, and what can be done about it?

It must also be stressed that the half of science that is wrong, as Horton commented, and as all large replication efforts have confirmed, is the best science, or what is considered the best. Consider how bad it must be in the lower tiers of research, where work is far less prominent, less well checked, and which exists in far greater number.

We next discuss some of the reasons the crisis exists and why so much over-certain science, marketing, and economics research is produced.

That’s the intro. Go read the rest. I promise you will not develop a growth if you click the link. You want to know what the partial solution is, right? Right?

Subscribe or donate to support this site and its wholly independent host using credit card click here. Or use the paid subscription at Substack. Cash App: $WilliamMBriggs. For Zelle, use my email, and please include yours so I know who to thank.



  1. I replicate stuff using the copy function on my multi function printer / scanner.

  2. ”I promise you will not develop a growth if you click the link.”

    And what happens if I don’t click the link? I sprout a second head?

    Clicking, just in case…

  3. “…flagrant conflicts of interest, together with an obsession for pursuing fashionable trends of dubious importance…”

    That about covers the climate, mask/vaccine, race, and trans cults.

  4. In science replication is critical: if I claim that X+Y = Z for me, it’s utter nonsense unless it works the same for you (and for the guys in Andromeda too). In science, therefore, your ideas about making the model explicit/predictive should work.

    Economics may be dismal, but it is not a science – and, like all of the “behavioral sciences,” its subject matter divides nicely into results reflecting the broad sweep of human intellectual evolution and publishable stuff that doesn’t.

    For the latter we’re usually dealing with earth-shattering conclusions drawn from both halves of a sample of three, using data that cannot normally be replicated because it is heavily influenced by generally unknown factors affecting individual experiment participants at the time and place of their participation.

    So in the “behavioral sciences” replication is really a non-issue, and forcing them to make predictive models will have no effect on the nature or quality of their work – notice, in this, that Keynesian economics, like socialism, has never yet produced a policy success but remains beloved in the field, where Festinger’s theory always works, but is hated and misrepresented.

  5. Many years ago the late B.D. McCullough demonstrated this very issue in the American Economic Review. Then-editor Ben Bernanke required that authors provide data and code. Unfortunately, most other journals did not follow suit. McCullough did much of the pioneering work in replication in economics.

  6. If known boundaries are not crossed there is no Novelty Search, and if instead boundaries are not known then there is no method (replication).
