What is the so-called Principle Of Indifference? A semi-screwy, semi-right idea in probability. To cadge an example from David Stove, let T be any tautology (a truth), then
Pr(Bob is black | T) = Pr(Abe is black | T) = etc.
where the constants “Bob”, “Abe”, or whom or whatever, are equal not because, as the Principle Of Indifference would say, you are “indifferent” between them, and not because there is “no reason” to select between them, but because any constant swapped in must, on this evidence, have the same probability. There is no proof of this: it is axiomatic. It gives the same answers as the POI, but I like this positive way of stating things because it more easily avoids mistakes and misunderstandings.
The biggest error in figuring probabilities is forgetfulness. Unnecessary paradoxes arise when evidence that is tacit is shunned or mislaid. Since all probability is conditional, we have to keep in the fore the precise premises we’re using.
With that warning, here are some supposed counterexamples to the POI given in the unpublished paper “A new theory of intrinsic probability” by Purdue’s Paul Draper. The paper was kindly provided to me by Justin Schieber. (I gather this paper makes the rounds; anyway, these examples are common.)
Incidentally, as always I follow a logical view of probability—which is not subjective Bayesianism nor frequentism. I’ll use single quotes to encapsulate propositions and double quotes for actual quotations.
No need to take these all at once. Read at your leisure.
Example 1: A Die
We have evidence E = ‘A six-sided object with sides labeled 1 through 6, which is tossed and which when tossed must show one of these sides.’ The probability of Q = ‘The die comes up 6’ given E is 1/6. In notation:
Pr(Q | E) = 1/6.
And this would evidently be the same if instead of Q we had Q1 = ‘The die comes up 1’ or Q2 = ‘The die comes up 2’ etc. via the logical argument used above or via the POI. Draper:
…a person not only has no more reason to believe any one of the six statements ‘the die comes up 1’, ‘the die comes up 2’, etc., than any other, but also no more reason to believe either of the statements ‘the die comes up 6’ and ‘the die does not come up 6’ than the other. In such a case, the principle of indifference implies that (relative to that person’s epistemic situation) the statement ‘the die comes up 6’ is equal in probability to each of the statements ‘the die comes up 1’, ‘the die comes up 2,’ etc. and also equal to the statement that the die does not come up 6. But that can’t be right. So the principle of indifference must be false.
Draper agrees with our first calculation but says that R = ‘The die does not come up 6’ has, conditional on something not quite E, probability of 1/2. But R, given E, is equivalent to ‘The die comes up 1 or 2 or 3 or 4 or 5’, and the probability of that given E is clearly not 1/2.
Draper forgot he had E. It appears, in calculating the probability of R, he was thinking of something like E’ = ‘Either a 6 comes up or it doesn’t’, but then Pr(Q | E’) does not equal 1/2 because E’ is a tautology; it is true no matter what, even if we are not dealing with dice. Pr(Q | E’) is thus the interval from 0 to 1, i.e. no fixed number. Of course, Draper may have had something other than E’ in mind, but what it was is a mystery since he didn’t make it explicit.
Assuming that numerical probabilities can be calculated conditional on tautologies is another common mistake. This compound error led him to wrongly reject the POI in this case.
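For those who like to check by brute force, here is a minimal sketch of the calculation with E kept in view. The list `SIDES` and the helper `pr` are my own illustrative names, not anything from Draper's paper; the only assumption is that E is encoded as six possible sides, each given the same weight.

```python
from fractions import Fraction

# Assumed encoding of E: six labeled sides, exactly one of which must show.
SIDES = [1, 2, 3, 4, 5, 6]

def pr(proposition, outcomes=SIDES):
    """Probability of `proposition` given E, each side weighted equally."""
    favorable = sum(1 for side in outcomes if proposition(side))
    return Fraction(favorable, len(outcomes))

pr_q = pr(lambda side: side == 6)  # Q = 'The die comes up 6'
pr_r = pr(lambda side: side != 6)  # R = 'The die does not come up 6'

print("Pr(Q | E) =", pr_q)
print("Pr(R | E) =", pr_r)
print("Sum       =", pr_q + pr_r)
```

With E written into the code there is no room for R to sneak off to 1/2: it is just the five sides that are not 6.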
Example 2: Tile size
Our evidence E = ‘Square tiles are produced from a factory having sides anywhere from 1 to 3 inches and this is a tile.’ Draper asks, “How probable is it that the length of [this] tile’s side is between 1 and 2 inches?” And he answers, conditional on E, 1/2.
But we also know the surface area of our tile can be from 1 to 9 square inches. Thus it appears that the probability, given E, of Q = ‘A surface area of this tile of between 1 and 5 square inches’ is 1/2. But the side lengths corresponding to these surface areas are 1 and 2.23607 inches, not 1 and 2. A contradiction! And bad news for the POI.
Draper then proposes a solution to this paradox which—I’m guessing he didn’t understand this—re-invokes the POI, but after a twist which I could not support more fully—because I’ve insisted on it many times.
Tile lengths cannot be continuous. No matter what, physics will limit the dimensions to a set of discrete sizes. It doesn’t matter what these are, only that they exist. Draper supposes 1 inch increments, but that doesn’t matter. Let them be 1/64 inch or whatever you like, and even let them be unequal; i.e. some increments are larger and some smaller than others, whatever. It doesn’t even matter if the tiles are square, but for ease of explanation suppose they are.
For fun and with Draper, amend E so that the square tile lengths are punched out in increments of 1 inch, but still between 1 and 3 inches. That means the only possible sizes are
(1, 2, 3) inches,
which correspond to only these surface areas
(1, 4, 9) square inches.
Recall Draper’s original question: “How probable is it that the length of the tile’s side is between 1 and 2 inches?” Given our modified E, the answer is 0. No chance at all. Tiles can be made with 1 inch or 2 inch sides all right, but no tile can be made between these lengths.
So let’s form a new Q: ‘The length of the tile is less than or equal to 2 inches’, which has, given E and the POI, probability 2/3. We can then ask what is the probability, given E, of R = ‘The surface area is less than or equal to 4.5 square inches’, which is also 2/3. (We’d get the same answer for any number less than 9 but more than or equal to 4 square inches.)
Because why? Because each possible length is tied to exactly one possible surface area. And because we invoked the POI, or rather its logical equivalent, just as Draper did without noticing it. And he probably didn’t notice it because he gave a new name to the POI which did the same service. About that new name, I say nothing more here. We have bigger problems.
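The discrete version is small enough to enumerate. What follows is only a sketch under the modified E (sides of 1, 2, or 3 inches, each weighted equally); `SIDE_LENGTHS` and `pr` are illustrative names of my own choosing.

```python
from fractions import Fraction

# Modified E (with Draper): square tiles punched out in 1-inch increments.
SIDE_LENGTHS = [1, 2, 3]  # inches

def pr(proposition, lengths=SIDE_LENGTHS):
    """Probability of `proposition` given the modified E."""
    favorable = sum(1 for side in lengths if proposition(side))
    return Fraction(favorable, len(lengths))

# Each side length fixes exactly one surface area.
areas = [side * side for side in SIDE_LENGTHS]

pr_between = pr(lambda side: 1 < side < 2)                 # strictly between 1 and 2
pr_q = pr(lambda side: side <= 2)                          # Q, stated in lengths
pr_r = pr(lambda side: side * side <= Fraction(9, 2))      # R, stated in areas

print("Areas:", areas)
print("Pr(1 < length < 2 | E) =", pr_between)
print("Pr(Q | E) =", pr_q)
print("Pr(R | E) =", pr_r)
```

Q and R pick out the same tiles (the 1-inch and 2-inch ones), so they must get the same probability whether we talk in lengths or in areas.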
Example 2 Extended: Infinite tile lengths
Draper and I solved the problem by changing it, which seems like cheating. But it isn’t, because the original problem can never be met in real life. No manufacturing process can ever make tiles with infinitely different lengths. But if you want infinite lengths, the trick is always to work things out discretely, and to only pass to the limit when searching for approximations to tough combinatoric problems. There just are no real life examples of infinitely graduated measurements. None. Zippo. Zilch. Nada.
That really is the full answer. It’s that passing to the limit which creates, incidentally, the parameters of the usual probability models (see this pdf example).
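To see what working discretely first buys you, here is a hedged sketch comparing two hypothetical factories. One punches sides in n equal steps from 1 to 3 inches; the other punches areas in n equal steps from 1 to 9 square inches. Both grids are assumptions of mine for illustration, as are the function names. In each case the evidence fixes a finite set of tiles, the answer is a plain count, and only then is n allowed to grow.

```python
from fractions import Fraction

def length_grid(n):
    """Areas of tiles whose sides come in n equal steps from 1 to 3 inches."""
    return [(1 + Fraction(2 * k, n)) ** 2 for k in range(n + 1)]

def area_grid(n):
    """Areas of tiles whose areas come in n equal steps from 1 to 9 sq. inches."""
    return [1 + Fraction(8 * k, n) for k in range(n + 1)]

def pr_side_at_most_two(areas):
    """Pr(side <= 2 inches), i.e. Pr(area <= 4), each listed tile weighted equally."""
    favorable = sum(1 for area in areas if area <= 4)
    return Fraction(favorable, len(areas))

for n in (8, 64, 1024):
    by_length = pr_side_at_most_two(length_grid(n))
    by_area = pr_side_at_most_two(area_grid(n))
    print(n, float(by_length), float(by_area))
```

The first column of probabilities settles near 1/2 and the second near 3/8. Neither is wrong: they are answers to different questions, because the two factories supply different evidence. The apparent paradox only shows up when the discrete premises are dropped and the limit is taken as the starting point.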
Example 3: Balls in bags
Our E is that there are three balls in a bag, each of which must be either white or black (that link immediately above works this problem out in nauseating mathematical detail). Draper asks, given E, “what is the probability that all three balls are black?” He says 1/8 via the route, “Consider the following eight statements: all three balls are black, the first two are black and the third white, the first two are white and the third black, etc. One can easily imagine having no more reason to believe any one of those eight statements than any other.”
But then, says Draper, “there are also four possible ratios of black balls to total balls in the urn (i.e., 1, 2/3, 1/3, and 0)…[and] the principle of indifference implies that the probability of the urn containing three black balls is 1/4.” Contradiction!
As above, Draper forgets some of his evidence. One of the ratios is indeed 3 out of 3, and another is 2 out of 3. But there are three ways to get 2/3: B1B2W3, B1W2B3, W1B2B3. Likewise, there are three ways to get 1/3, and just one way to get 0/3. That makes 8 total arrangements, only one of which contains all black balls; thus, conditional on the full evidence (and notice even Draper started by labeling the balls but then forgot), we’re back to 1/8.
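The counting can be checked mechanically. This is a minimal sketch assuming, as Draper's own eight statements do, that the balls are labeled (first, second, third) and that each labeled arrangement gets equal weight; the variable names are mine.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All labeled arrangements of three balls, each black (B) or white (W).
arrangements = list(product("BW", repeat=3))

# How many arrangements produce each count of black balls.
ways = Counter(arrangement.count("B") for arrangement in arrangements)

# Probability of each count, given that every arrangement is weighted equally.
pr_by_count = {
    black: Fraction(count, len(arrangements)) for black, count in ways.items()
}

for black in sorted(pr_by_count, reverse=True):
    print(f"{black}/3 black: {ways[black]} way(s), probability {pr_by_count[black]}")
```

The four ratios are not four equally weighted cases: the 2/3 and 1/3 ratios each cover three arrangements, and only one arrangement of the eight is all black.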
To solve the last problem, Draper again used his renamed and morphed POI to get back to where he never should have left. He did that in support of what he calls “intrinsic probability”, which is the probability a proposition has in absence of “our” evidence. I don’t buy that, but that’s a subject for another day, particularly since I haven’t let Draper have a say.
Forget that for now. Before us is the Principle of Indifference, or rather its logical counterpart. It manfully stood up to the supposed counterexamples Draper presented, as indeed it has rebuffed all challenges I ever heard tell of. All finite discrete challenges, that is.
Nothing in the world wrong with infinity. Why, God Himself is infinite. But in mathematics it makes a difference how you approach it. Two people taking different paths will end up at Infinity, all right, but in vastly different neighborhoods. And each will claim that because he can’t see the other, the other isn’t there, thus somebody must be in error.
(There may be no better demonstration of this than the first three minutes of this video.)
Update: Computing tip
Notation can be a dangerous tool, leading easily to the Sin of Reification, which is when the notation becomes real and that which it represents is forgotten. But notation is also darn helpful. When computing some probabilities, it is best to use it. Write out Draper’s first counterexample in notation, keeping all our premises:
Pr( The die does not come up 6 | A six-sided object with sides labeled 1 through 6, which is tossed and which when tossed must show one of these sides)
Stated so plainly, it is (next to) impossible to make the mistake of thinking this probability is 1/2.
Update: See Part II: There Is No Such Thing As Intrinsic Probability.