Teaching Journal: Day 3

In the real, physical class we learned to count yesterday. Elementary combinatorics, I mean. Figured out what “n!” and “n choose k” and the like meant and how that married with probability and produced a binomial distribution. Pure mechanics. (We begin Chapter 4 today.)

I trust, dear reader, that if you don’t already know these things you can read over the class notes to learn them, or find one of the hundreds of internet sites that carry this sort of information. It is of some interest and we will later use the binomial for this and that. If my LaTeX renderer is working, you should be able to see this formula for the binomial (if not, even Wikipedia gets this one right):

     Binomial = {n\choose k} p^k (1-p)^{n-k}

Now the thing of interest for us is that we must have some evidence, or list of premises, or propositions taken “for the sake of argument”, that we assume is true. Here that evidence states E = “There is some X which can and must take one of two states, a success or a failure, and the chance X is a success is always p; plus, X can be a success anywhere from k = 0, 1, 2, …, n times.” Then, given this E, the formula above gives the probability that in n attempts we see k successes.
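
As a minimal sketch of that calculation, assuming Python (the class notes do not use it) and a hypothetical helper named binomial_probability, the formula can be evaluated with exact fractions. The values p = 1/6 and n = 3 are illustrative only.

```python
from fractions import Fraction
from math import comb


def binomial_probability(k, n, p):
    """Probability of exactly k successes in n attempts, given E with chance p."""
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    p = Fraction(p)
    # (n choose k) * p^k * (1 - p)^(n - k), kept as an exact fraction
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Illustrative values only: p = 1/6 and n = 3 attempts.
p = Fraction(1, 6)
for k in range(4):
    print(k, binomial_probability(k, 3, p))
```

Exact fractions are used instead of floating point so that the probabilities over k = 0, …, n can be checked to sum to exactly 1.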

An example of an X can be X = “A side shows 6.” Now given the evidence (we have seen many times before) Ed = “There is a six-sided object to be tossed, just one side is labeled ‘6’, and just one side can show” we know that

     Pr(X | Ed) = 1/6 = p.

Notice that if n = k = 1, then the binomial formula just reproduces p.
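
To see why, substitute n = k = 1 into the binomial formula. This is just a worked step using the notation above:

```latex
{1 \choose 1} p^{1} (1-p)^{1-1} = 1 \cdot p \cdot 1 = p
```

With p = 1/6 from Ed, the single-toss probability of a 6 is recovered.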

Here is where it gets tricky, and where mistakes are made. Notice that Ed is evidence we assume is true. Whether it really is true (with respect to any other external evidence) is immaterial, irrelevant. Also notice—and here is the juice; pay attention—Ed says nothing about a real, physical die. We are still in the realm of pure logic. And logic is just a study of the relation between propositions: it is silent on the nature of the propositions themselves.

So for instance, let Em = “Just one-sixth of all Martians wear a hat” and let Y = “The next Martian to pass by wears a hat.” Thus

     Pr(Y | Em) = 1/6 = p.

For the next n Martians that pass by, we could calculate the probability that k = 0, or k = 1, or … k = n of them wear a hat. Even though, of course, given our observational evidence that there are no Martians, no Martian will ever pass by.

(If you are sweating over this, remove “Just one-sixth of” from Em; then Pr(Y | Em) = 1 and we can still use the binomial to calculate the probability that the next k of n Martians wear a hat.)
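
A small sketch of both Martian calculations, assuming Python and a hypothetical helper named hat_probabilities; the choice n = 4 is illustrative, since n is left open here. Nothing in the code knows or cares whether Martians exist: it only uses the premises.

```python
from fractions import Fraction
from math import comb


def hat_probabilities(n, p):
    """Return [Pr(k of the next n Martians wear a hat) for k = 0..n], given chance p."""
    p = Fraction(p)
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


n = 4  # illustrative choice; n is not fixed by the premises

# Em = "Just one-sixth of all Martians wear a hat", so p = 1/6.
for k, prob in enumerate(hat_probabilities(n, Fraction(1, 6))):
    print(f"k = {k}: {prob}")

# Modified Em with "Just one-sixth of" removed, so p = 1.
# All of the probability then sits on k = n.
print(hat_probabilities(n, 1))
```

In the p = 1 case the formula still works: every term with k < n carries a factor of (1 - p) = 0, and only the k = n term survives.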

Probability statements, like all logical statements, are measures of information, specifically of the information between propositions. The propositions do not have to represent real, physical objects. Pick up any book of introductory logic to convince yourself of this.

Where people go wrong in statistics is in not starting with the reminder that probability is logical, a branch of logic. Thus they confuse Ed with saying something about real dice. They ask questions like, “How do we know the die isn’t weighted? How do we know how it’s tossed? How much spin is imparted? What kind of surface is the die tossed onto? What about the gravitational field into which the die is tossed? Is there a strong breeze?”

All of those (and many, many more) are excellent questions to ask about real dice, but all of them are absolutely irrelevant to Ed and to our absolute, deduced knowledge that Pr(X | Ed) = 1/6.

It is only later, after we learn the formal rules of probability and some basic mechanics, but more importantly after we have fully assimilated the interpretation of probability, that we invert things and ask questions like, “Given that we have seen so many real-life tosses of this real-life die, in this certain real-life situation, what is the probability we will see a 6 on the very next roll?”

Make sure you see the difference between that last question and the ones above that use the binomial with Ed or Em.

Read over Chapter 2 and be sure you understand the four most basic rules of probability (the mechanics stuff).

Correct any typos in this post.


  1. MarkF

    It seems to me, when not dealing with certainty (Pr = 0 or Pr = 1) in a properly deduced logical conclusion regarding probability, sometimes additional premises are assumed. For instance, let Y = “The next Martian to pass by wears a hat”, let Em = “Just one-sixth of all Martians wear a hat”, and let En = “Martians that pass by do so randomly.”

    This would yield: Pr(Y | Em and En) = 1/6. Otherwise, isn’t randomness assumed in the statement: Pr(Y | Em) = 1/6?

    It is often preferable not to write out additional assumptions or to reword statements, but unless it is done, we may not all be reasoning from the same starting point. You could be inviting people to invent their own unstated premises (assumptions) in the non-universal cases as is seen from the responses to previous chapters.

    By the way, I prefer the current version of your website to the recently considered new version.

  2. JH

    … then we can still us the binomial to calculate the probability that the next k of n Martians wears a hat.

    A box contains 18 Martians, of which 3 wear a hat. (Everyone knows that Martians are as big as a golf ball.) The probability that the next 2 of 3 Martians drawn from the box wear a hat depends on whether you draw with or without replacement. If they’re drawn sequentially as described in this post, the probability can’t be calculated using a binomial formula.

    Have fun thinking about the differences among binomial, geometric, hyper-geometric, and negative binomial distributions!

  3. DAV

    I believe the point is that Pr(Y | Ed) = 1/6 is a statement about our level of knowledge. “Randomness” doesn’t apply. “Randomness” is not an inherent property. It’s just the name assigned to causes beyond our current ken. “Pick one at random” means do so in a way that won’t of itself influence the outcome.

    I agree with the Pr(Y | Ed) = 1/6 only if the “six-sided object” is, in fact, a gaming die as we know them, or at least a cube. If it is not, or if its shape is unknown, the only reasonable Pr(Y | Ed) is “?”, because most “six-sided objects” in this world are shaped more or less like books rather than dice (books themselves, of course, but also cell phones, cars, laptops, buttered sliced bread, etc.). For any number of these, assigning 1/6 to our level of knowledge of which side will show after a toss is unreasonable. I notice that Matt slid “die” and “dice” into the explanation. He should have started with them instead of “six-sided object”.

  4. Uncle Mike

    Plato did not play dice with the universe.

    Ideal forms, ya know.

  5. In Chapter 2 Problem(4) you ask:
    Let your evidence be E = “This is a coin, only one side of which is an H.”
    Suppose you toss this coin and see that it turns up heads 40 times out of
    50. You want to toss it one more time. What is your estimate of Pr(H|E)
    and why?

    Now I am guessing that you want the answer 1/2 here because the fact, F, of the 40 heads in 50 tosses is not part of the “E” in Pr(H|E). But I want to know what you would say about Pr(H|EF). Would your answer to this not depend on having some notion of what a “coin” is, and in particular of how likely it is to be perfectly balanced? And would not the requisite information bring an aspect of relativism or dare I say *subjectivity* into the analysis?

    I don’t know if there is any non-trivial domain of discourse in which your idea of “logical probability” can be proved to make sense, but my doubts are greatly increased if you want to assign “probabilities” to statements that involve terms from common language.

  6. dearieme

    typo. us -> use

  7. Tom

    >>Pick up any book of introductory logic to convince yourself of this.

    “If you can’t explain it simply, you don’t understand it well enough.” ~ Albert Einstein
