Podcast Lecture #5: Four Probability Rules

Science Gone Wild with William M Briggs


Today’s Lecture:


I recently upgraded to Ubuntu 9.10, which included a new version of Audacity, 1.3.9, a persnickety package that keeps locking up. Word on the ‘net is that this happens to the 64-bit version, but I have both a 32-bit and a 64-bit machine and it locks up on both. I tried Ardour (a professional soundboard), but the thing is so complicated that I haven’t figured it out in time to record this week’s lecture.

R is coming!

There have been requests for a series of podcasts on R. These are coming, probably in late January or February. R forms Chapter 5 of the class notes.

Conditional Probability

A key insight of logical probability is that all probability is conditional. Most books don’t get around to defining “conditional probability” for many chapters, but it’s best to understand from the start that all probability is with respect to, or given, or assuming that, certain information is true. Think about the die example we have been using all along. The probability of seeing a 6 is conditional on our premises.

It was not a good habit for the old books to write probability such that it looks unconditional: this led to many misunderstandings. For a good example, grab any old book and look at the notation it uses for gambling problems. If ever there was an argument for logical probability, it’s gambling. The book will write, for example, the probability of dealing an “Ace” as Pr(Ace), and give the premises separately: there is a deck of 52 cards, just 4 of which are labeled “Ace”, and just one card will be drawn. All of that should have been listed explicitly as the evidence E, to avoid confusion: that is, Pr(Ace | E).
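To make the conditioning visible, here is a minimal Python sketch. The function name `pr` and the encoding of the evidence E as a list of card labels are illustrative assumptions, not notation from the lecture.

```python
from fractions import Fraction

# Evidence E (assumed encoding): a deck of 52 cards, just 4 of which are
# labeled "Ace"; just one card will be drawn.
deck = ["Ace"] * 4 + ["Other"] * 48

def pr(label, evidence):
    """Pr(label | evidence): the share of cards in the evidence with that label."""
    return Fraction(evidence.count(label), len(evidence))

print(pr("Ace", deck))  # 1/13
```

The evidence is an explicit argument, so it cannot be forgotten. Hand the same function different premises (say, a deck with the four Aces removed) and the probability of “Ace” changes to 0.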


Good news! All philosophies of probability agree on the following rules, as long as the events, propositions, and statements in which we have an interest are discrete and finite. Discrete means non-continuous: we’re not (yet) going to use real numbers, only integers and rational fractions. And we are going to limit our objects of interest to those of a finite nature. This is acceptable because (so far) all events of interest to us are discrete and finite.

Probability Rule #1

Conditional on some evidence, if some thing of interest can be broken into parts, and one of those parts must be true, then the probabilities of those parts (each conditional on our evidence) sum to 1. Take our die example: it can be broken into the parts “a 1 shows”, “a 2 shows”, … , “a 6 shows.” The probability that some number (from 1 to 6) shows is 1.

Another way to state this is “Either a 1 shows or a 2 shows or a 3 shows…” The probability of that statement, conditional on our standard die premises, is 1. Something must show!

Probability Rule #1: ORs turn to +’s.
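A short Python sketch of Rule #1, assuming the standard die premises give each face a probability of 1/6. The variable names are illustrative. The second half checks the caveat raised in the comments: the parts being added must not overlap.

```python
from fractions import Fraction

faces = range(1, 7)  # standard die premises: six sides, exactly one shows
pr_face = {f: Fraction(1, 6) for f in faces}

# "a 1 shows OR a 2 shows OR ... OR a 6 shows": the ORs turn to +'s.
total = sum(pr_face.values())
print(total)  # 1

# Caveat: the parts must not overlap. "A multiple of 2 OR a multiple of 3"
# counts the face 6 twice if the two probabilities are simply added.
evens = {f for f in faces if f % 2 == 0}    # {2, 4, 6}
threes = {f for f in faces if f % 3 == 0}   # {3, 6}
naive = Fraction(len(evens), 6) + Fraction(len(threes), 6)
correct = Fraction(len(evens | threes), 6)  # union: {2, 3, 4, 6}
print(naive, correct)  # 5/6 2/3
```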

Probability Rule #2

Conditional on some evidence, if we have two (or more) events (or propositions, etc.) that can occur, and knowledge of the first event is irrelevant to knowing anything about the second event, then the probability of the first event and the second event occurring is the probability of the first event (given the evidence) times the probability of the second event (given the evidence).

Take throwing two dice and our standard premises. What is the probability (conditional on that evidence) of seeing two 6s? Knowing the result of the first throw tells us nothing about the second. Knowing the result of the second throw tells us nothing about the first. That is, knowledge of either event is irrelevant to knowledge of the other. Then the probability of the statement “a 6 shows on the first throw and a 6 shows on the second” given our premises is the probability of a 6 on the first throw times the probability of a 6 on the second throw.

Probability Rule #2: ANDs turn to ×’s.
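The two-dice example can be checked in a few lines of Python. This is only a sketch: it assumes each face has probability 1/6 under the standard premises, and the counting step treats all 36 ordered pairs alike, which is itself a consequence of the irrelevance assumption.

```python
from fractions import Fraction
from itertools import product

pr_six = Fraction(1, 6)  # Pr(a 6 shows | standard die premises)

# Rule #2: "a 6 on the first throw AND a 6 on the second" turns into a product.
by_rule = pr_six * pr_six

# Cross-check by listing every ordered pair of faces and counting (6, 6).
outcomes = list(product(range(1, 7), repeat=2))
double_sixes = sum(1 for first, second in outcomes if first == 6 and second == 6)
by_counting = Fraction(double_sixes, len(outcomes))

print(by_rule, by_counting)  # 1/36 1/36
```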

The classical way to state Rule #2 is that the events are independent. I prefer Keynes’s term, irrelevant, because it keeps everything in terms of knowledge. This can be very important in some situations; mistakes in reasoning are easier to make with the classical wording.



  1. JJD

    Minor quibble on Probability Rule #1: OR of *non-overlapping* parts turns to +. The probability of “a multiple of 2 OR a multiple of 3 shows” is not 3/6 + 2/6. The requirement of not overlapping is also not made completely clear in the discussion leading up to Rule #1.

    On the other hand, you explained independence nicely for #2 as a precondition for multiplication of probabilities. It is very true that the term “irrelevant” is much more apt and intuitively understandable.

    Condolences on the havoc ensuing after an upgrade to Ubuntu 9.10. If you were to switch over to the current “stable” version of Debian, you would have a remarkably familiar-looking Linux environment in which almost everything actually works. Ubuntu is based on the bleeding-edge “unstable” Debian and lately has been a hobby horse for competition with Windows Vista user interface gimmicks in apparent disregard for actual usability.

  2. George

    Ubuntu 9.10 has a history of sound problems dating back over a year, often blamed on early adoption of pulseaudio, and possibly a poor integration of it. For me, the Ubuntu 9.04 and Ubuntu 9.10 upgrades both broke sound sharing across users… Commenting out a couple of lines in a config file in /etc/pulseaudio seemed to help on 9.04, but I haven’t had a chance to properly verify whether it helps with 9.10 yet.

    I find your position that it’s not necessary to assert that the dice are “random” interesting. Doesn’t it imply that the probability of rolling a pair of 6s should be considered greater than 1 in 36? If, starting with just “the dice have 6 faces each, numbered from 1 to 6”, but not assuming that P(6) is 1/6, you throw a die and it comes up 6, doesn’t that mean you ought to factor that result into the evidence for throwing the second die? After all, if all dice were biased to come up with 6 half the time, but you didn’t assume that (or neutrality, or any other bias), observing the first roll and assuming the second is more likely to match than not seems reasonable, and even Bayesian.

  3. DAV

    R? It will be interesting to hear you aurally demonstrate the use and intricacies of a written (i.e., visual) language or are you intending to only extol its wonders?

  4. Briggs

    DAV, Listen. And be amazed!


    I’m back to ALSA.

    Re: dice. Good questions, but not quite. We are still in the (logical) probability realm where we are computing probabilities not yet conditional on observation statements. That is, we are still theoretical and not realistic. We need to first understand the logic before we can see how actual data fits in.


    Right. The sub-events must be unique.

    I know. I have an older machine that I swear—one of these days!—that I’ll put Debian on. I had Fedora (naturally!) for quite a while. I used the Fedora machine as a server.

  5. michel

    Oh dear, when will they learn?

    You need to get over to Debian. The reason is, Ubuntu is newly mastered every six months or so from Debian Experimental. No-one in their right mind uses Debian Experimental, let alone uses a new version every six months. The Lord must love Ubuntu for this to work at all.

    Now, if you use Debian, you get a cohort of packages which is a distribution, and this cohort is making its way through phase reviews. So it starts out as Experimental, which is a warehouse, not a distribution. Next it gets to Unstable, which is a distribution but not one anyone should be using for work. Then it becomes Testing. This is the first version you should consider using. And finally, every couple of years or so, it is awarded the Good Housekeeping Seal of Approval, and it gets to be Stable.

    You should either run Stable, or you should run Testing if you want more up to date packages. But never run any version of Testing that is less than a year old.

    You might as well make the move. Because Debian is where everyone ends up. It’s just how much grief they give themselves on the way there.

  6. Hi William,

    are you going to continue your explanation of your ‘I’ axiom which allows you to deduce P(a 6 shows) = 1/6? I’m still confused as to your justification for it.

    — Ralph

    p.s. I use Debian/Ubuntu at work every day. My machines at home run Vista. My home machines are *much* more pleasant to work with!

  7. PeterM

    I have Audacity version 1.3.4 (Ansi), and run it on TurboPup, a derivative of Barry Kauler’s puppylinux, which is optimised for P III machines. Don’t think I need a later version. I mainly use it if I want to apply the “compress dynamics” Nyquist plugin to a file.

    It’s as stable as a rock, though I do get console warnings about conversion of UTF-8 strings.
