Bayes’s theorem, while nice, is not necessary. What counts is the relation between sets of propositions; between our assumptions and what we want to know. Bayes can even slow you down.
Video
Links: YouTube * Twitter – X * Rumble * Bitchute * Class Page * Jaynes Book * Uncertainty
HOMEWORK: Given below; see end of lecture.
Lecture
This is an excerpt from Chapter 8 of Uncertainty.
Bayesian theory isn’t what most think. Most believe that it’s about “prior beliefs” and “updating” probabilities, or perhaps a way of encapsulating “feelings” quantitatively. The real innovation is something much more profound. And really, when it comes down to it, Bayes’s theorem isn’t even necessary for Bayesian theory. Here’s why.
Again, any probability is denoted by the schematic equation $\Pr(\mbox{Y}|\mbox{X})$, which is the probability the proposition Y is true given the premise X. As always, X may be compound, complex or simple. Bayes’s theorem looks like this:
$$\Pr(\mbox{Y}|\mbox{W}\mbox{X}) = \frac{\Pr(\mbox{W}|\mbox{YX})\Pr(\mbox{Y}|\mbox{X})}{\Pr(\mbox{W}|\mbox{X})}.
$$
We start knowing or accepting the premise X, then later assume or learn W, and are able to calculate, or “update”, the probability of Y given this new information. Bayes’s theorem is a way to compute $\Pr(\mbox{Y}|\mbox{W}\mbox{X})$. But it isn’t strictly needed. We could compute $\Pr(\mbox{Y}|\mbox{W}\mbox{X})$ directly from knowledge of W and X themselves. Sometimes the use of Bayes’s theorem can hinder.
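The theorem-as-computation view can be made concrete. Here is a minimal sketch, just the right-hand side of the equation above wrapped in a function; the argument names are mine, not standard notation:

```python
# A sketch of Bayes's theorem as a pure computation: the "updated"
# probability Pr(Y|WX) from the three quantities on the right-hand side.
def bayes(pr_w_given_yx, pr_y_given_x, pr_w_given_x):
    """Return Pr(Y|WX) = Pr(W|YX) * Pr(Y|X) / Pr(W|X)."""
    return pr_w_given_yx * pr_y_given_x / pr_w_given_x

# One consistent triple of inputs (taken from the machine example
# worked through below): Pr(W|YX) = 1/2, Pr(Y|X) = 1/3, Pr(W|X) = 1/3.
print(bayes(1 / 2, 1 / 3, 1 / 3))  # 0.5
```

The point of the example that follows is that this function is a convenience, not a necessity: the left-hand side can often be had directly.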
An example. Given X = “This machine must take one of states S$_1$, S$_2$, or S$_3$” we want the probability of Y = “The machine is in state S$_1$.” The answer is 1/3. We then learn W = “The machine is malfunctioning and cannot take state S$_3$”. The probability of Y given W and X is 1/2, as is trivial to see. Now find the same result by applying Bayes’s theorem; the answers must match. We know that $\Pr(\mbox{W}|\mbox{YX})/\Pr(\mbox{W}|\mbox{X}) = 3/2$ (because $\Pr(\mbox{Y}|\mbox{X}) = 1/3$). But it’s difficult at first to tell how this comes about. What exactly is $\Pr(\mbox{W}|\mbox{X})$, the probability the machine malfunctions such that it cannot take state S$_3$ given only the knowledge that it must take one of S$_1$, S$_2$, or S$_3$?
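The direct route needs no theorem at all: enumerate the possibilities allowed by the premises and count. A minimal sketch (the state labels are just illustrative strings):

```python
from fractions import Fraction

# Direct computation by enumeration -- no Bayes's theorem required.
# X = "the machine must take one of states S1, S2, or S3".
states = ["S1", "S2", "S3"]

# Pr(Y|X), where Y = "the machine is in state S1".
pr_y_given_x = Fraction(states.count("S1"), len(states))
print(pr_y_given_x)  # 1/3

# Learn W = "the machine cannot take state S3": simply shrink the
# set of possibilities the premises allow, and recount.
remaining = [s for s in states if s != "S3"]
pr_y_given_wx = Fraction(remaining.count("S1"), len(remaining))
print(pr_y_given_wx)  # 1/2
```

All the "updating" here is just counting over a smaller set of possibilities; the new premise W changes what we condition on, and the probability follows.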
We argue that if the machine is going to malfunction, then given the premises we have (X) the malfunctioning state is equally likely to be any of the three; thus the probability is 1/3. Then $\Pr(\mbox{W}|\mbox{YX})$ must equal $1/2$, but why? Given we know the machine is in state S$_1$, and that it can take any of the three, the probability that S$_3$ is the malfunctioning state is 1/2, because the malfunctioning state cannot be S$_1$, but can be S$_2$ or S$_3$. Using Bayes works, as it must, but in this case it adds considerably to the burden of the calculation.
Most scientific, which is to say empirical, propositions start with the premise that they are contingent. This knowledge is usually left tacit; it rarely (or never) appears in equations. But it could: we could compute $\Pr(\mbox{Y}|\mbox{Y is contingent})$, which is even quantifiable (the open interval $(0,1)$). We then “update” this to $\Pr(\mbox{Y}|\mbox{X & Y is contingent})$, which is 1/3 as above (students should derive this). Bayes’s theorem is again not needed.
Of course, there are many instances in which Bayes facilitates. Without this tool we would be more than hard pressed to calculate many probabilities. But the point is the theorem can but doesn’t have to be invoked as a computational aide. The theorem is not the philosophy.
The real innovation in Bayesian philosophy, whether it is recognized or not, is the idea that any uncertain proposition can and must be assigned a probability. This dictum is not always assiduously followed; and the assignment need not be numerical. Contrast this with frequentist theory, which assigns probabilities to some unknown propositions while forbidding the assignment for others, and where the choice between the two is ad hoc. Bayesian theory has two main flavors, subjective and objective. The subjective branch assigns probabilities based on emotions or “feelings”, a practice we earlier saw leads to absurdities. Objective theory tends to insist every probability can and should be quantified, which also leads to mistakes. We’ll find a third path by quantifying only that which can be quantified and by not making numbers up to satisfy our mathematical urges.
Subscribe or donate to support this site and its wholly independent host using credit card click here. Or use the paid subscription at Substack. Cash App: \$WilliamMBriggs. For Zelle, use my email: matt@wmbriggs.com, and please include yours so I know who to thank. BUY ME A COFFEE.