How Not To Think Like A Bayesian Rationalist

Lisping Rationalists

When I read people like Eliezer Yudkowsky and Scott Alexander it becomes clear to me how the French could have built a Temple of Reason during The Terror. Constructed, you recall, to hold practicing members of the Culte de la Raison.

Alexander started a recent article with what I thought, and what any sane person would think, was a rhetorical question: “Does it matter if COVID was a lab leak?”

Why, yes, Mr Alexander. It matters a great deal. For if hubris-filled Dr Frankensteins are unbinding Prometheus in gain-of-lethality experiments and wreaking global death, we ought to know about it.

Which is what I thought he’d say. But he said this instead:

A good Bayesian should start out believing there’s some medium chance of a lab leak pandemic per decade. Then, if COVID was/wasn’t a lab leak, they should make the appropriate small update based on one extra data point. It probably won’t change very much!

I did fake Bayesian math with some plausible numbers, and found that if I started out believing there was a 20% per decade chance of a lab leak pandemic, then if COVID was proven to be a lab leak, I should update to 27.5%, and if COVID was proven not to be a lab leak, I should stay around 19-20%.

But if you would freak out and ban dangerous virology research at a 27.5%-per-decade chance of it causing a pandemic per decade, you should probably still freak out at a 19-20%-per-decade chance. So it doesn’t matter very much whether COVID was a lab leak or not.

I don’t entirely accept this argument – I think whether or not it was a lab leak matters in order to convince stupid people, who don’t know how to use probabilities and don’t believe anything can go wrong until it’s gone wrong before. But in a world without stupid people, no, it wouldn’t matter. Or it would matter only a tiny amount. You’d start with some prior about how likely lab leaks were – maybe 20% of pandemics – and then make the appropriate tiny update for having one extra data point.

This is gibberish, but gibberish couched in technical scientific language, which makes it a form of scientism.

As if one data point is not enough! The residents of Nagasaki circa 1945 can tell you all about the sufficiency of a mere one additional data point.

Rationalists love Bayes’s Theorem. They have much to say about how everything is, or should be, a matter of BT, and about how, if it were, a paradise in thinking would arrive. Here’s how the Rationalists at Less Wrong define themselves:

The rationalist movement, rationality community, rationalsphere or rationalistsphere represents a set of modes of bayesian thinking from self-described rationalists or ‘aspiring rationalists’ typically associated with the Less Wrong diaspora and their associated communities.

Diaspora. Good grief, how well they think of themselves. Never mind. Let’s see how they treat Bayes:

Bayesians conceive rationality as a technical codeword used by cognitive scientists to mean “rational”. Bayesian probability theory is the math of epistemic rationality, Bayesian decision theory is the math of instrumental rationality. Right up there with cognitive bias as an absolutely fundamental concept on Less Wrong.

Let’s see why this is more wrong. There’s going to be some math below, but I don’t want your eyes to glaze over. I made it as easy as possible. Plow through it to get to the main points. The point is that math is not the point.

We’ll do the math first and return to the lab leak at the bottom.

Bayes’s Theorem

Bayes’s Theorem is simplicity itself. That is, the dry, textbook version of it is. Which is near useless in most real-life examples.

You begin with some proposition H, called a hypothesis, and conjure some probative information about it, so that you can form your prior. Most rationalists write this incorrectly as “Pr(H)”, which implies the hypothesis just “has” a probability. Nothing has a probability. Write “Pr(H|E)” instead, where E is the long string of complex probative information you assume about H. Then Pr(H|E) is the probability H is true given E. An entirely objective calculation based on your subjective assumptions. This is your uncertainty in H assuming E.

Next, “data” is collected or assumed, even just one point. Call it “D”. Then, through some mathematical manipulation, all perfectly true, you get this:

Pr(H|DE) = Pr(D|HE) x Pr(H|E) / Pr(D|E).

Some things to note about this equation. Pr(H|E) we already know. “Pr(D|HE)” is called the data model (some say likelihood), and it depends on the assumptions you bring just as much as Pr(H|E) does, with the additional assumption that H is true. The denominator, Pr(D|E), is the probability of the data without assuming H one way or the other; it is the data model totted up over both possibilities: Pr(D|E) = Pr(D|HE) x Pr(H|E) + Pr(D|not-H & E) x Pr(not-H|E).

There are nice technicalities about how to work this out. Yet even if you don’t follow any of this, if I told you “Pr(H|E) = 0.2”, and “Pr(D|HE) = 0.7”, and that “Pr(D|E) = 0.18”, then anybody (well, some bodies) can calculate “Pr(H|DE) = 0.78”.

That “Pr(H|DE)” is called the posterior, the probability of H after seeing or assuming D, and assuming E. It has changed from Pr(H|E) = 0.2; it has increased, because, in this fictional example, D has a better chance if H is true than if H is not.
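For those who want the arithmetic spelled out, here is a minimal sketch in Python using the same made-up numbers; the expansion of Pr(D|E) into both possibilities is included only to show why D “has a better chance if H is true”.

```python
# Bayes's Theorem with the made-up numbers from the text.
# Every probability is conditional on the background evidence E.

pr_H = 0.2           # Pr(H|E): probability of H given the assumptions E
pr_D_given_H = 0.7   # Pr(D|HE): the data model, assuming H is true
pr_D = 0.18          # Pr(D|E): probability of the data, not assuming H

# Posterior: Pr(H|DE) = Pr(D|HE) x Pr(H|E) / Pr(D|E)
pr_H_given_D = pr_D_given_H * pr_H / pr_D
print(f"Pr(H|DE) = {pr_H_given_D:.2f}")  # 0.78

# Totting up Pr(D|E) over both possibilities implies what Pr(D|~HE) must be:
# Pr(D|E) = Pr(D|HE) x Pr(H|E) + Pr(D|~HE) x Pr(~H|E)
pr_D_given_notH = (pr_D - pr_D_given_H * pr_H) / (1 - pr_H)
print(f"Implied Pr(D|~HE) = {pr_D_given_notH:.2f}")  # 0.05
# D is far likelier if H is true (0.7) than if it is not (0.05),
# which is why the posterior rises from 0.2 to 0.78.
```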

This is the kind of calculation Alexander did, using H = “Lab leak”, and, like me, making up the other numbers. It’s Stats 101 stuff. The math is right. And useless.

Well, of almost no use.

This is not how people think of uncertainty, nor should it be in general. It is not wrong. It is not even less wrong. It’s that this math is answering the wrong question.

Here’s how people really think about uncertainty, and should.

If you asked anybody (or most bodies, none of whom were employed in gain-of-lethality research) in the fall of 2019, “What’s the chance of a lab leak of a deadly Expert-manufactured virus?”, they’d probably say, “How would I know?” Which is the right answer. The rational answer. And the point at which Bayesian rationalists first go wrong, because here they are powerless: they can’t yet use BT! Pause and ponder this, because this will turn out to be the final answer.

Now most bodies in 2019 hadn’t any idea what the words “Lab leak” even meant, let alone “deadly Expert-manufactured virus”. These are vague, fuzzy terms. They do not lend themselves to making quantitative judgments about uncertainty. Besides that, what assumptions about lab leaks were most bodies supposed to make, even presuming they knew what the words meant?

On the other hand, it was not impossible to form some kind of guess. You could have sat down with a body and tried to explain what all the terms meant, yet since (then) we hadn’t seen coronadoom, none of us (not involved in the research) would have had a precise idea of what everything meant. Still, if pressed, some bodies might have been able to go as far as saying things like this:

Pr(Some kind of lab leak thingee | Some genetics and lab practices I just heard about) = “small, I guess.”

If you further pressed this body what “small” meant, you could have squeezed a number out of him, but these squeezings, I mean the assumptions that went into them, all go on the right hand side of the equation (to the right of the “|”).

All right, let’s have our “one” data point enter the picture. Just what is it?

Alexander couldn’t be bothered to specify his, instead just assigning it a number. But our “one” data point is not so simple as that. And it’s not the same for everybody, unless they all knew the exact same things about the claimed leak.

Here’s the kicker: the definition of “lab leak”, and the nature of the assumed evidence (not just the new data point), changed, and changed radically, from late 2019 to today.

It’s a little known secret that you do not need Bayes’s Theorem at all. Not to quantify uncertainty. Sure, it makes things easier in many cases, and the math is dandy fine. But it’s not necessary. Here’s a mathematical example for the exceptionally curious.
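Here is a tiny numerical illustration, using the same made-up numbers as above: if you can specify the joint probabilities Pr(H & D | E) directly, then Pr(H|DE) is just a conditional probability read straight off the joint, and the theorem never has to be invoked as a separate formula.

```python
# Conditioning directly on the joint, no Bayes's Theorem required.
# The joint probabilities below are made up, chosen to match the example above.
joint = {
    (True, True): 0.14,    # Pr(H & D | E)
    (True, False): 0.06,   # Pr(H & ~D | E)
    (False, True): 0.04,   # Pr(~H & D | E)
    (False, False): 0.76,  # Pr(~H & ~D | E)
}

# Pr(H|DE) is just: the probability that H and D are both true,
# divided by the probability that D is true. No separate "prior",
# "likelihood", or "evidence" objects are needed.
pr_D = sum(p for (h, d), p in joint.items() if d)
pr_H_and_D = joint[(True, True)]
print(f"Pr(H|DE) = {pr_H_and_D / pr_D:.2f}")  # 0.78, same answer as before
```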

I’ll try to explain it here without the math.

Real Uncertainty

What we always want, BT or no, are answers to questions like this:

Pr(H|E) = ?

Where H is our hypothesis (any proposition), and E our assumptions, data, beliefs, models, everything. Inside E are also the definitions of what H even means. E is all the evidence we have in favor of, or against, H. Everything we’re willing, or can, consider, at any rate. E is a complex proposition. So can be H.

Unless it’s textbook homework, like above where everything is specified or obvious, no two people have the same E for a complex H. Which is why it’s rare for people to agree on the uncertainty of things. (I have two small papers on this recently, which I’ll highlight here at a later date.)

Here’s an illustration of what I mean. I don’t mean anybody ever writes things down like this, even professional researchers. I mean this is how people think, and that if researchers wanted to better grasp how people think they ought to write things like this.

We start with equations like this, which we can call our “prior”, if you like (take your time with this; the ‘&’ are logical ands.):

Pr(H|E) = Pr(H_1 & H_2 & H_3 & … & H_n(h) | E_h(1) & E_h(2) & E_h(3) & … & E_d(1) & E_d(2) & E_d(3) & … & E_1 & E_2 & E_3 & … & F_1 & F_2 & F_3 & …)

This is, of course, only a cartoon, but it is in the right form of how people think about uncertainty.

Each of those sub-propositions makes up E (and, later, the model for D). Some of them, the ones with the subscripts ‘h’ and ‘d’, provide the definitions of the words used to form H and the model for D. Definitions are always part of your assumptions. The words in E itself need definitions too, and those definitions are likewise part of your assumptions. The ‘F’ are also E, but I want to separate them out for a reason I’ll tell you below. After all, “coronavirus” has to have a certain meaning, as does “leak”. We can’t get anywhere without having word, and grammar, definitions in mind. I cannot stress this strongly enough.

The remaining E are all those propositions we’re considering as probative of H (those that explain it, or explain it away), even if the explanations are mere correlations. As you can see, even the hypothesis is a compound proposition (think of precisely what “lab leak” can mean).

The thing is, even before “D” arrives, the E shift and morph, especially when H is complex. But, ignoring that and assuming all evidence is fixed before we see D, then we can at least give an impression of Pr(H|E), but maybe not a quantification.

Now “D” arrives. D looks like this:

D = D_1 & D_2 & D_3 & …

So what we now want is

Pr(H|E) = Pr(H_1 & … & H_n(h) & H’_2 & H’_4 & … | D_1 & D_2 & D_3 & … & E_h(1) & E’_h(2) & E’_h(3) & E’_h(4) & … & E_d(1) & E_d(2) & E’_d(3) & … & E_1 & E_3 & … & G_1 & G_2 & G_3 & …)

What happened to the F? They were expunged and replaced by G! Because when we see the hideously complex D, we change our minds about some of the other evidence we first considered. Out it goes! And, perhaps, here comes some new evidence to replace it. The hypothesis has also changed, with some parts removed and other parts added (those primes indicated with single quotation marks), moves which necessitate changes in E.
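To make the bookkeeping concrete, here is a toy sketch in Python. No probabilities are computed; every proposition label is invented purely for illustration. The only point is that both H and E get rewritten when D shows up.

```python
# A toy sketch of the bookkeeping only: no probabilities are computed,
# just the way the propositions on both sides of Pr(H|E) shift when D arrives.
# Every label below is invented for illustration.

# Before D: hypothesis and evidence as sets of labelled propositions.
H = {"H1: some lab-made virus escaped", "H2: the escape caused the outbreak"}
E = {
    "Eh1: definition of 'lab'",
    "Ed1: definition of the data model",
    "E1: genetics I heard about",
    "E2: lab practices I heard about",
    "F1: bat-soup video",  # the F's: evidence we will later discard
}

# D arrives: itself a complex proposition, not a lone "data point".
D = {"D1: furin cleavage site report", "D2: proximity of the lab", "D3: ..."}

# Seeing D changes our minds about the rest of the evidence (F out, G in)
# and even about what the hypothesis means (H2 gets reworded).
E_new = (E - {"F1: bat-soup video"}) | D | {"G1: new reporting on lab safety"}
H_new = (H - {"H2: the escape caused the outbreak"}) | {"H2': the escape seeded the pandemic"}

# The question has changed: Pr(H_new | E_new) is not Pr(H | E) "updated by
# one extra data point"; both sides of the bar have been rewritten.
print(H_new != H and E_new != E)  # True
```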

I show just how all this works in the papers I mentioned, which are really only ideas borrowed from ET Jaynes (who, like us, was a logical and not subjective probabilist). All you need take away here is that both sides of the equation change, and not just because of the addition of new data D.

There is no one single fixed textbook set of hypothesis, data, model, prior, and therefore posterior in these kinds of real-life questions. If there were, as in trivial and textbook examples, then everybody would always agree, or at least come closer to agreeing, when new data came in.

That is not what happens in real life. People disagree, and often move farther apart when new data arises. And though many people are indeed irrational, and make any of a seemingly infinite number of thinking mistakes, it is not irrationality that drives this conclusion. In fact (sneak peek!), it is Bayes’s Theorem itself!

The idea that rationality comes only with Bayes’s Theorem is false.

Lab Leak

I want to convey that the whole idea of Alexander’s “one extra data point”, and his gross hand-waving with BT, is ridiculous, but I don’t want to rehash the entire saga of how the lab leak hypothesis was refined.

It should be obvious enough that the term “lab leak” itself morphed and changed from the days of that video of a Chinese woman eating bat soup to today’s discussion of furin cleavage sites and the like. Both the “H” and “E” have undergone radical surgery.

The point of all these changes is to get to the causes of “lab leak”, one way or the other, so that we know the full truth. Not that this is always possible. It’s not that you couldn’t use BT for every micro-step along the very long path of this argument, but if you did, H had to remain rigorously fixed, which of course it did not. That judgment follows directly from the math. It is “Pr(H|DE)” and not “Pr(H’|DE)”.

The evidence for a “lab leak” is long and complex. All this data can be considered as “one” data point, treating D as yet another complex proposition. But nobody serious should do this. Treating D as “one extra data point” leads to Alexander’s hand-waving, absurd conclusion. Because he doesn’t treat that “one extra data point” with any seriousness; he doesn’t consider rationally all the points of D; it’s only a tool that he can assign a ridiculous number to in his slavish use of BT.

We want to know if hubris-filled Dr Frankensteins are unbinding Prometheus in gain-of-lethality experiments and wreaking global death. All the evidence we’ve seen so far indicates they did so once. Will they do so again? Bayes Theorem that.


9 Comments

  1. Leonard

    This may be subjective, but if you’ve ever worked in a biolab in the US you’d know how often you have to go through government inspections and how often they find “issues”. You’d also know the sheer volume of bio waste.

    Therefore your “educated guess” at pathogen escape probability would be 100%. To assume 20% is laughable.

    The whole pandemic response was a farce. The bureaucracy is set up to trust nobody – and then when the bureaucracy fails we are supposed to assume the bureaucrats can be trusted. Orwell might call that doubleplus doublethink.

  2. cdquarles

    Given that no one is perfect, over time, the probability of a lab leak approaches, if not reaches, one. “If something can go wrong, it will; and at the worst possible time”, so said a genius. I never worked in a biolab; but I have worked in pathology ones. I’ve also worked in wet chemistry labs. There is no chemical that man can make that the rest of nature can’t. Plants can’t run from predators, so they deal with predators chemically. Nature is full of poison, where poison is conditional: you have to know what, how much, and to whom, to say the least. Importantly, every successful embodied living thing alters its local environment to enhance its own survival. (Be fruitful and multiply, lest ye be replaced.)

  3. Mike Anderson

    Alexander might have been plausible had he started with a non-informative prior. BUT to start with some anally-extracted value renders all that follows absurd. Never mind the fact that behavioral economics tells us there are few or no rational actors. "Rationalist" == "Bullshxtter"

  4. Johnno

    What’s the Bayesian probability that many such rationalists as these “die suddenly” throughout this one year?

    Will it be high enough to merit “concern”?

  5. Since in their universe there are no such things as truth and beauty this makes perfect sense to a Scientismist and it is logically consistent constrained as it is by the premises of Scientism. You’re just a mechanical automaton working thanks to blind chance so it all just happens by accident. You’re an epiphenomenon coincidentally along for the ride.

    The only way to deal with this Satanic lunacy is by the sword – both metaphorical and physical.

  6. Rob

    The probability of a lab leak is 1.

    It has happened in the past and will happen in the future. I am not talking about Covid here (although that is clearly 1 as well), but the past 3 or 4 smallpox outbreaks were lab leaks and the most recent foot and mouth outbreak in the UK was a lab leak.

    So, the issue is not the fact of a lab leak, but the consequences and given those (possible) consequences what measures should be taken. The only way to stop lab leaks is to not have a lab. But that won’t stop pandemics because there are other causes than lab leaks for pandemics.

    So what benefits do we get from a lab to mitigate upcoming pandemics such that the consequences of a lab leak are worth it? This is not a scientific question or possible to answer from any kind of probability analysis – it is a human societal question because “worth it” is a human value.

    Medical research has progressed a lot through research on the worst viruses, resulting in societal benefits, but how many dead people are “worth it”? Or maybe not just number of dead people, but years of life lost? Or quality adjusted life years (my personal favourite, but hey, who am I to judge)?

    The definition of “worth it” is not scientific, therefore society has to debate what it means and – crucially – be given the opportunity to debate it and the power to decide it. My concern as I get older is that this power has already gone and the opportunity is being seriously curtailed.

  7. What Alexander et al really say is that if you assume something might be true and then new evidence in favor of the hypothesis shows up, it can strengthen your belief. All the other stuff is numerical masturbation – akin to the calculation of climate change over the next hundred years to eight decimal places – based on the social club idea that attaching numbers to beliefs proves rationality and is confirmatory. (And what Briggs says is that BS begets BS – if you can’t quantify all of your assumptions you can’t quantify any of the conclusions either.)

  8. Johnno

    We should hire an Expurt to count for us the probability of any marked increases of accidental gun discharges that may harm this same lot in the future following the next lab leak. All by pure chance, of course! Instruct the model to say that. We should call it ‘leaky pistol theory.’
