# Video

Announcement: In order to preserve sanity, I am not going to include any technical proofs in the Video lectures from now on. As important, indeed crucial, as these are, there are at most 10 guys in the world who would be interested, and 9 of them don’t even know they should be. Because everybody thinks they have probability already all figured out. Which we have been proving they do not. Proofs will still appear in the written lectures, and all questions about them will be answered.

Rumble

Bitchute (often a day or so behind, for whatever reason)

HOMEWORK: (A) How many TOTAL groups can you get from groups of 5 from the numbers 1 through 70 AND one group of 1 from the numbers 1 through 25? (B) Which gives a higher TOTAL number of groups: adding 29 numbers to the first batch (so we have 1 through 99), or keeping the 70 but choosing groups of 6 instead of 5?

# Lecture

Last week’s homework was recalling why Pr(M_6|E) = 1/6, and we saw that Pr(M_6|E + “fair”) = 1/6 (a circular argument!). Now give us Pr(M_6|E + “unfair”) = ?

First, if “fair” means anything, it means this: Pr(M_6|E + Pr(M_6|E)) = Pr(M_6|E) by definition, a circular argument, or it means some impossibly perfect symmetry in the device/die and its workings. Did you watch the coin flip video? (Blog, Substack). Now, what is Pr(M_6|E + “unfair”)?

There is no answer! It depends on what you bring to the word UNFAIR. There is no definite meaning. Each of you will have a different tacit definition in mind, each of which changes the probability.

There are only two lessons in this entire course. Just two. Last week I said one, but I meant two. Because two is more than one. The lessons are:

1. ALL probability about the proposition of interest Y is conditional on assuming the evidence X, i.e. Pr(Y|X). Change the X, you change the probability. Change the minutest fraction of X that is relevant to Y, you change the probability. Add a data point, change the probability. Make a new emphasis, change the probability. ANY change necessitates a change in the probability (keeping in mind relevance).
2. The ontology is not the epistemology. Probability is not real. The uncertainty we have in a thing is not the thing itself, which has no uncertainty in itself! Forgetting this leads to the Deadly Sin of Reification and accounts for the large (non-DIE) errors in science.

Some will think “unfair” means Briggs must have weighted the die so that it always comes up 6. So the probability is 1/6. Others will recall I do amateur magic, and that I would have figured people would choose 6, so I did 1 instead. So the probability is 0. Still others will say they have no idea, except that one of the states must still obtain. We still assume E! So the probability is in [0,1] — with strict bounds. And on and on, depending on what you mean by “unfair”.

Change the evidence, change the probability!

Now we also learn to count, though we take only a whirlwind tour. Counting is one of the most complicated mathematical subjects there is; it goes by the name combinatorics.

You are also welcome to download (free!) my book Breaking the Law of Averages, and read Chapter 3 on counting.

Briefly, though, we learn that the number of ways to arrange n things (where the order matters) is n!, which is n x (n-1) x (n-2) x … x 1. Because of calculus (gamma functions), 0! = 1. We learn that the number of ways to arrange k out of n is n!/(n – k)!, where the order still matters. And we learn that the way to choose groups of k from n, where the order does not matter, is called “n choose k”, which equals n!/ [ k! x (n – k)! ].
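These three facts can be checked with Python’s standard library, which has `math.factorial`, `math.perm`, and `math.comb` built in (Python 3.8+). A minimal sketch:

```python
import math

# Arrangements of n things, order mattering: n!
assert math.factorial(5) == 5 * 4 * 3 * 2 * 1 == 120

# By convention (the empty product), 0! = 1
assert math.factorial(0) == 1

# Arrangements of k out of n, order still mattering: n!/(n-k)!
assert math.perm(10, 3) == math.factorial(10) // math.factorial(10 - 3) == 720

# "n choose k", order not mattering: n!/[k! (n-k)!]
assert math.comb(70, 5) == math.factorial(70) // (math.factorial(5) * math.factorial(65))
print(math.comb(70, 5))  # 12103014
```

The last line is the kind of count the homework asks for: groups of 5 drawn from the numbers 1 through 70.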

We really only need these simple facts, and these three probability rules, and we can derive nearly all the probability we’ll ever need. That’s how much we have done, though it might not seem like it.

1. Pr(AB|C) = Pr(A|BC)Pr(B|C) = Pr(B|AC)Pr(A|C). Bayes’s theorem.
2. Pr(A|B) + Pr(not-A|B) = 1.
3. Pr(A_i|B) = 1/n, if B says there are N states only one of which must obtain (and says nothing more).
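The three rules can be verified numerically on the die example. A sketch using exact fractions; the particular propositions A and B in the Bayes check are my own choices, for illustration only:

```python
from fractions import Fraction

# Rule 3: the evidence says there are six states, exactly one of which must obtain.
n = 6
pr = {i: Fraction(1, n) for i in range(1, n + 1)}

# Rule 2: Pr(A|B) + Pr(not-A|B) = 1, with A = "the die shows 6".
assert pr[6] + sum(pr[i] for i in range(1, 6)) == 1

# Rule 1: Pr(AB|C) = Pr(A|BC)Pr(B|C) = Pr(B|AC)Pr(A|C),
# with A = "shows an even number", B = "shows more than 3".
A, B = {2, 4, 6}, {4, 5, 6}
pr_AB = sum(pr[i] for i in A & B)            # Pr(AB|C) = 2/6
pr_A_given_B = Fraction(len(A & B), len(B))  # Pr(A|BC) = 2/3
pr_B_given_A = Fraction(len(A & B), len(A))  # Pr(B|AC) = 2/3
assert pr_AB == pr_A_given_B * sum(pr[i] for i in B)
assert pr_AB == pr_B_given_A * sum(pr[i] for i in A)
```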

That is it. From there we can really go to town, and do. Next time we start on Jaynes Chapter 3, which brings these rules together in an elegant way.

Below is David Stove’s proof that allows us to get that third point.

This is an excerpt from Chapter 4 of Uncertainty. All the references have been removed.

Now Stove’s attempt, in my notation and somewhat shortened. The statistical syllogism is deduced from the symmetry of logical constants in this example. Given H = “Just two of Abe, Bob, and Charles are black”, the probability of B = “Abe is black”, relying on the statistical syllogism, is 2/3. Let T be any tautology, a necessary truth. Then $\Pr(\mbox{HB}|\mbox{T}) = \Pr(\mbox{H}|\mbox{T})\Pr(\mbox{B}|\mbox{TH})$. Rearranging, and because logically TH is equivalent to H, we have $\Pr(\mbox{B}|\mbox{H}) = \Pr(\mbox{HB}|\mbox{T}) / \Pr(\mbox{H}|\mbox{T})$.

H is logically equivalent to
$$B_1B_2B^c_3 \vee B_1B^c_2B_3 \vee B^c_1B_2B_3,$$
where $B_1$ = “Abe is black”, $B^c_3$ = “Charles is not black”, and so forth. And that means
$$\Pr(B_1|\mbox{H}) = \frac{\Pr((B_1B_2B^c_3 \vee B_1B^c_2B_3 \vee B^c_1B_2B_3)B_1|\mbox{T})}{\Pr(B_1B_2B^c_3 \vee B_1B^c_2B_3 \vee B^c_1B_2B_3|\mbox{T})}.$$
Distributing $B_1$ in the numerator gives
$$\Pr(B_1|\mbox{H}) = \frac{\Pr(B_1B_2B^c_3|\mbox{T}) + \Pr(B_1B^c_2B_3|\mbox{T})}{\Pr(B_1B_2B^c_3|\mbox{T}) + \Pr(B_1B^c_2B_3|\mbox{T}) + \Pr(B^c_1B_2B_3|\mbox{T})},$$
because $B^c_1B_2B_3B_1$ is impossible. Here is Stove’s big move. He states

$$\Pr(B_1B_2B^c_3|\mbox{T}) = \Pr(B_1B^c_2B_3|\mbox{T}) = \Pr(B^c_1B_2B_3|\mbox{T});$$

but also

$$0 < \Pr(B_1B_2B^c_3|\mbox{T}) < 1.$$

Thus because of the symmetry of individual constants, the statistical syllogism is deduced. The 2/3 probability follows from the labels, here the names, being “exchangeable” with respect to T.
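Stove’s argument can be checked by brute force: treat the 2^3 black/not-black assignments to (Abe, Bob, Charles) as exchangeable, hence equally probable, given T, then condition on H. The enumeration below is my own illustration, not Stove’s:

```python
from fractions import Fraction
from itertools import product

# All 2^3 assignments of black/not-black to (Abe, Bob, Charles);
# exchangeability with respect to T makes them equally probable.
worlds = list(product([True, False], repeat=3))

# H: exactly two of the three are black -- the three disjuncts above.
H = [w for w in worlds if sum(w) == 2]

# Pr(B_1|H) = Pr(H B_1|T) / Pr(H|T), with B_1 = "Abe is black" (index 0).
pr_H = Fraction(len(H), len(worlds))
pr_HB1 = Fraction(sum(1 for w in H if w[0]), len(worlds))
print(pr_HB1 / pr_H)  # 2/3
```

Only two of the three H-worlds have Abe black, so the ratio is (2/8)/(3/8) = 2/3, matching the statistical syllogism.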

Subscribe or donate to support this site and its wholly independent host using credit card click here. Or use the paid subscription at Substack. Cash App: \$WilliamMBriggs. For Zelle, use my email: matt@wmbriggs.com, and please include yours so I know who to thank.

1. TFBW

The link to Breaking the Law of Averages has been sabotaged by one of your enemies.

2. Briggs

TFBW,

Fixed. Thanks.

3. gareth

16:43: “0!=1” – eh?

n!=n*n-1…
if n=0 then n!=0

Comment: agree with your decision to make lectures more narrative and leave proofs for notes.

Do keep up the excellent and interesting work 🙂

4. Briggs

gareth,

If n = whatever, and k = 0,1,2,…, n. Then n – k = 0 at the end. And so we need 0!.

5. gareth

@ Briggs:

I thought that we must be at cross purposes or something – you said/wrote:

n! = n x (n-1) x (n-2) … 1 (which is my understanding of a factorial, n multiplied by each smaller positive integer, ending at 1).

But you then said 0! = 1

I thought: why is this? if n = 0 then any product = 0 too. Is 0! then *defined* as 1 rather than 0?

Refers self to Wikipedia (the free encyclopedia that anyone can edit)…

Which says: “The value of 0! is 1, according to the convention for an empty product” and “The notion of an empty product is useful for the same reason that the number zero and the empty set are useful: while they seem to represent quite uninteresting notions, their existence allows for a much shorter mathematical presentation of many subjects” and “It is by convention equal to the multiplicative identity”

So now I know – one learns something every day 🙂

6. JH

gareth,

A lazy answer: 0! = 1 by definition.

An answer based on your thought: (n-1)! = (n!)/n. So 0! = (1-1)! = (1!)/1 = 1.

There are more complex explanations as well.

7. JH

3. Pr(A_i|B) = 1/n, if B says there are N states only one of which must obtain (and says nothing more).

Lowercase n and uppercase N have special meanings in statistics and the classic approach to probability. Anyhow, Something is missing in this statement. Is this statement about the principle of insufficient reason, which is my preferred term (instead of the principle of indifference)?

8. Briggs

JH,

Who knew that math was case sensitive.