(This is a modified excerpt from my forthcoming—he said hopefully—book, on the subject of why probability cannot be relative frequency. This is to be paired with the essay on why probability cannot be subjective. I particularly want to know if I have made this excruciatingly difficult subject understandable, and what parts don’t make sense to you.)
For frequentists, probability is defined to be the frequency with which an event happens in the limit of “experiments” where that event can happen; that is, if you run a number of “experiments” that approaches infinity, then the ratio of those experiments in which the event happens to the total number of experiments is defined to be the probability that the event will happen. This obviously cannot tell you what the probability is for your well-defined, possibly unique, event happening now, but can only give you probabilities in the limit, after an infinite amount of time has elapsed for all those experiments to take place. Frequentists obviously never speak about propositions of unique events, because in that theory there can be no unique events.
There is a confusion here that can be readily fixed. Some very simple math shows that if the probability of A is some number p, and you give A many chances to occur, the relative frequency with which A does occur will approach the number p as the number of chances grows to infinity. This fact, that the relative frequency approaches p, is what led people to the backward conclusion that probability is relative frequency.
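The “very simple math” is usually stated as the weak law of large numbers. Here is a sketch of it, under assumptions not spelled out above: the chances are independent, and A has the same probability p on each of them. The symbols S_n (the number of times A occurs in the first n chances) and ε (any tolerance you like) are my own labels.

```latex
\[
  \lim_{n \to \infty}
  \Pr\!\left( \left| \frac{S_n}{n} - p \right| > \varepsilon \right) = 0
  \qquad \text{for every } \varepsilon > 0 .
\]
```

Notice the direction of the reasoning: p is given first, and the behavior of the relative frequency S_n / n follows from it. Frequentism runs the statement backward and takes the limit of S_n / n as the definition of p.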
The confusion was helped because people first got interested in frequentist probability by asking questions about gambling and biology. The man who initiated much of modern statistics, Ronald Aylmer Fisher, was also a biologist who asked questions like “Which breed of peas produces larger crops?” Both gambling and biological trials are situations where the relative frequencies of the events, like dice rolls or ratios of crop yields, very quickly approach the actual probabilities. For example, drawing a heart out of a standard poker deck has logical probability 1 in 4, and simple experiments show that the relative frequency of hearts drawn quickly approaches this. Try it at home and see.
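If you do not have a deck handy, the experiment can be imitated on a computer. This is only a sketch under stated assumptions: each draw is taken from a full, freshly shuffled 52-card deck (that is, with replacement), and Python's pseudorandom generator stands in for the shuffling. The function name heart_frequency and the seed are my own choices for illustration.

```python
import random

SUITS = ["hearts", "diamonds", "clubs", "spades"]
# A standard 52-card deck: 13 ranks in each of 4 suits.
DECK = [(rank, suit) for suit in SUITS for rank in range(1, 14)]


def heart_frequency(n_draws, seed=0):
    """Relative frequency of hearts in n_draws draws from a full deck.

    Assumption: every draw is made from the complete deck, so the
    draws do not affect one another.
    """
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    hearts = sum(1 for _ in range(n_draws) if rng.choice(DECK)[1] == "hearts")
    return hearts / n_draws


if __name__ == "__main__":
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(f"{n:>7} draws: relative frequency of hearts = {heart_frequency(n):.4f}")
```

With few draws the frequency can sit well away from 1 in 4; with many it settles close to it. Note that the program had to be told the composition of the deck before it could produce any frequency at all.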
Since people were focused on gambling and biology, they did not realize that not all arguments that have a logical probability match a relative frequency. To see this, let’s examine some arguments in closer detail. This one is from Stove (1983; we’ll explore this argument again in Chapter 16).
Bob is a winged horse
---------------------
Bob is a horse
(Screen note: this is to be read “Bob is a winged horse, therefore Bob is a horse”: stuff above the line is the evidence, stuff below is the conclusion.)
The conclusion given the premise has logical probability 1, but it has no relative frequency, because there are no experiments in which we can collect winged horses named Bob (and then count how many are horses). This example might appear contrived, but there are others in which the premise is not false and a relative frequency of the conclusion being true does not, or cannot, exist; however, a discussion of these would take us further than we want to go in this book.
A prime difficulty of frequentism is that we have to imagine the experiments that pertain to an argument if we are to calculate its relative frequency. In any argument, there is a class of events that are to be called “successes” and a general class of events that are to be called “chances.” Think of the die roll: successes are sixes and chances are the number of rolls. While this might make sense in gambling, it fails spectacularly for arguments in general. Here is another example, again adapted from Stove.
(A)
Miss Piggy loved Kermit
-----------------------
Kermit loved Miss Piggy
What are the classes of successes and chances? The success cannot be the unique event “Kermit loved Miss Piggy” because there can be no unique events in frequentism: all events must be part of a class. Likewise, the chances cannot be the unique evidence “Miss Piggy loved Kermit.” We must expand this argument to define just what the successes and chances are so that we can calculate the relative frequencies. It turns out that this is not easy to do. This argument has three different choices:
(B)
Miss Piggy loved X
------------------
X loved Miss Piggy
(C)
Y loved Kermit
--------------
Kermit loved Y
(D)
Y loved X
---------
X loved Y
Evidence (from repeated viewings of The Muppet Show) suggests that the logical probability and the relative frequency of (A) are both 0. Any definition of successes and chances based on this argument (so that we can actually compute a relative frequency) should match the logical probability and relative frequency of (A). Now, because of Miss Piggy’s devotion, the relative frequency of (B) seems to match that of (A), where we have substituted the variable X for Kermit, a perfectly acceptable way to define the reference classes. But we are just as free to substitute Y for Miss Piggy. However, the relative frequency of (C) is about 0.5 and does not, obviously, match that of (A) or (B). Finally, under the rules of relative frequency, we can substitute variables for both our protagonists and see that the frequency of (D) is nothing like the frequency of any of the other arguments. Which is the correct substitution to define the reference class? There is no answer.
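To see how the choice of reference class changes the answer, here is a toy calculation. Everything in it is an assumption made for illustration: the relation LOVES is invented (it is not a tally from the show), the extra characters c1 to c4 are hypothetical, and the pairs were picked only so that the frequencies behave as described above. The point is the mechanism, not the numbers.

```python
from fractions import Fraction

# Invented toy relation: (a, b) means "a loved b".
# Not data from the show; c1..c4 are hypothetical characters.
LOVES = {
    ("piggy", "kermit"),
    ("c1", "kermit"), ("kermit", "c1"),
    ("c2", "kermit"),
    ("c3", "kermit"), ("kermit", "c3"),
    ("c1", "c2"), ("c2", "c1"),
    ("c3", "c4"), ("c4", "c3"),
    ("c2", "c4"),
}


def relative_frequency(chances):
    """Share of chances (a loved b) in which b also loved a."""
    chances = list(chances)
    successes = [(a, b) for a, b in chances if (b, a) in LOVES]
    return Fraction(len(successes), len(chances))


# (A) the unique argument itself: one chance, Miss Piggy loved Kermit.
freq_a = relative_frequency([("piggy", "kermit")])
# (B) Miss Piggy loved X: chances are everyone Miss Piggy loved.
freq_b = relative_frequency(p for p in LOVES if p[0] == "piggy")
# (C) Y loved Kermit: chances are everyone who loved Kermit.
freq_c = relative_frequency(p for p in LOVES if p[1] == "kermit")
# (D) Y loved X: chances are every pair in the relation.
freq_d = relative_frequency(LOVES)

if __name__ == "__main__":
    for label, freq in (("A", freq_a), ("B", freq_b), ("C", freq_c), ("D", freq_d)):
        print(f"({label}) relative frequency = {freq} ~ {float(freq):.3f}")
```

The same function and the same relation give different frequencies depending only on which pairs we agree to count as chances. Nothing in the code, or in frequentism, says which count is the right one.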
It’s worse than it seems, too, even for the seemingly simple example of the die toss. What exactly is the chance class? Tossing this die? Any die? And how shall it be tossed? What will be the temperature, dew point, wind speed, gravitational field, how much spin, how high, how far, for what surface hardness, and on and on to an infinite progression of possibilities, none of them having any particular claim to being the right class over any other. The book by Cook (2002) examines this particular problem in detail. And Hajek (1996) gives examples of fifteen (count ’em, fifteen) more reasons why frequentism fails, most of which are beyond what we can look at in this book.
These detailed explanations of frequentist peculiarities are to prepare you for some of the odd methods and the even odder interpretations of these methods that have arisen out of frequentist probability theory over the past ~100 years. We will meet these methods later in this book, and you will certainly meet them when reading results produced by other people. You will be well equipped, once you finish reading this book, to understand common claims made with classical statistics, and you will be able to understand its limitations.
1. While an incredibly bright man, Fisher showed that all of us are imperfect when he repeatedly touted a ridiculously dull idea: eugenics. He figured that you could breed the idiocy out of people by selectively culling the less desirable. Since Fisher also has a strong claim on the title Father of Modern Genetics, many other intellectuals at the time, all with advanced degrees and high education, agreed with him about eugenics.
2. Stolen might be a more generous word, since I copy this example nearly word for word.