Before you is a box in which is a slip of paper on which is written either ‘0’, ‘1’, ‘2’, or ‘3’. Given that premise, what is the probability of X = “‘3’ is written”?
Right: it’s 1/4.
Notice, incidentally, that the probability does not tell us how that number came to be written on the paper, nor why the paper is there, nor how you will “draw” the paper from the box, information which in any case is irrelevant. All that we deduce from the premise is that writing exists and you are partially but not wholly ignorant of what it is.
Different set up. In a new box (or even the same!) are three marbles, each either white or black. Given that premise, what is the probability of X = “The number of white marbles is 0 (or 1, or 2, or 3)”? First note that we deduce there may be no white marbles, or no black ones, or any combination of the two, as long as there are 3 in total.
The probability may not be as obvious, and indeed the formal mathematical proof begins with the hypergeometric “distribution”, noting the logical equivalence of constants, and carrying all this forward. You can take my word—or Laplace’s, a more eminent authority, and the man who first derived it—that the calculations produce a probability of 1/4 for 0 white marbles (and 1/4 for each of 1, 2, or 3), which might be in line with your intuition, as suggested by the first example.
Again notice that there isn’t any word about “drawing” the marbles out, how the marbles got their color, or anything else to do with causes. Some thing or things must have caused the color and number of the marbles, but we know it not.
A third set up. In a new new box are three marbles, each either white or black, and we’re going to draw out three of them. Given that premise, what is the probability of X = “The number of white marbles is 0 (or 1, or 2, or 3)”?
Let’s enumerate. Given this premise, we could see any of the following sequences, and number of white marbles:
- W1W2W3, 3
- W1W2B3, 2
- W1B2W3, 2
- B1W2W3, 2
- W1B2B3, 1
- B1W2B3, 1
- B1B2W3, 1
- B1B2B3, 0
This indicates the probability (given the premises) of 0 whites is 1/8, of 1 white is 3/8, of 2 whites is 3/8, and of 3 whites is 1/8.
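The enumeration above can be checked mechanically. What follows is a minimal sketch, not a formal derivation: it assumes, as the list does, that each of the eight ordered sequences gets equal weight, and the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Every ordered sequence of three draws, each white (W) or black (B).
sequences = ["".join(s) for s in product("WB", repeat=3)]

# Tally how many sequences contain each number of white marbles.
counts = Counter(seq.count("W") for seq in sequences)

# Assumption: each sequence is weighted equally, as in the list above.
prob_whites = {k: Fraction(counts[k], len(sequences)) for k in sorted(counts)}

for k, p in prob_whites.items():
    print(f"{k} white: {p}")
```

Run as written, this prints 1/8, 3/8, 3/8, and 1/8 for 0, 1, 2, and 3 whites, matching the tally by hand.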
Something has changed. The second example, conditional on very similar premises to the third, gave a probability of 1/4 for each possibility, while the third example gives varying answers. What gives? “Paradox!” answer some. Howson and Urbach, in their influential Scientific Reasoning: The Bayesian Approach, argue from the apparent paradox to a justification of subjective probability (pp. 59-62 in the second edition; the first edition is available as a PDF). Besides being wrong, this is defeatist. “We don’t know what the probability should be so we can make up whatever feels best” can scarcely be a satisfactory answer. (Though it is to all “subjective Bayesians”.)
Still, it’s odd. Similar premises give completely different answers. To see what’s happening, let’s change the premise in the first example so that the slip of paper has the marking “W1W2W3”, or the marking “W1W2B3”, and so on through all eight sequences. Given that premise, what is the probability of the marking “B1W2B3”? Easy: 1/8, and it’s the same for any marking. The probability would also be 1/8 were the premise that the paper had the marking ‘1’, ‘2’, etc., ‘8’.
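A small sketch makes the point that, on a premise of this kind, only the number of distinct labels matters and not what the labels say. The helper `uniform_over` is a hypothetical name introduced for illustration; it assumes the premise tells us nothing beyond “exactly one of these distinct labels is written”.

```python
from fractions import Fraction

def uniform_over(labels):
    """Assign equal probability to each distinct label (hypothetical helper)."""
    distinct = list(dict.fromkeys(labels))  # drop repeats, keep order
    return {label: Fraction(1, len(distinct)) for label in distinct}

# Four labels, as in the first example.
counts = uniform_over(["0", "1", "2", "3"])

# Eight labels: the sequence markings, written here without position indices.
markings = uniform_over(["WWW", "WWB", "WBW", "BWW", "WBB", "BWB", "BBW", "BBB"])

# Eight labels again, this time the digits '1' through '8'.
digits = uniform_over([str(i) for i in range(1, 9)])

print(counts["3"])      # 1/4
print(markings["BWB"])  # 1/8
print(digits["5"])      # 1/8
```

The sequence markings and the digits give the same 1/8 because both premises name eight distinct possibilities and nothing more.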
So what makes writing the labels 0 through 3 the same (in probability) as “In a new box are three marbles, each either white or black”? And why are they dissimilar to “In a new new box are three marbles, each either white or black, and we’re going to draw out three of them”?
Cause; rather, our knowledge of causes. All we require of the markings in the first example is that they be distinct. All? Well, not quite all. We also require that the label, whatever it is, be written in advance, by some (unknown, unspecified) cause. That cause fixes the labels or marbles in advance. Our knowledge of cause in these cases is extremely limited: we only know that a cause, or causes, must have been present, and we know the outcome.
But there is no way to think of drawing out marbles without envisioning some kind of drawing-out cause. If you are practiced at simulation, this will make sense—and all of us are practiced at simulation. Think coin flips. There is no way to imagine, or rather to manufacture, a string of three flips, or three anythings with dichotomous outcomes, that does not make reference to physical causes.
The difference in our understanding of causes between the two situations is real, and huge, at least when the number of observations is very small. Once we get going and start taking observations, the two views (it can be shown) “collapse” to the same answer, especially for “large N” (many observations).
This is not the only system in which the measurement process dramatically changes our perspective, as the not inapt comparison to quantum mechanics reveals. Anything-we-know-not-what could have fixed the labels/constituents of the box, whereas not just any old thing could make a string of white/black (0/1, etc.) emerge from a box. Far from being a paradox, the differences in probabilities highlight the importance of measurement and the knowledge which comes with it.
Conclusion? There is no problem with logical probability, a.k.a. probability as argument.