Class 62: Die P-value, Die! Die! Die!

We want Pr(H|Evidence), we get Pr(Data we didn’t see | H false). And then everybody pretends they are the same. Why why why.

HOMEWORK: Given below; see end of lecture.

Lecture

This is a modified excerpt from Chapter 9 of Uncertainty.

So much has been written on the dismal subject of p-values, including some above, that it seems like piling on to say more. But I say more only to note three things: two are commonplace, and they lead to a third which is not.

First, nobody ever remembers the definition of a p-value. Everybody translates it to the probability, or its complement, of the hypothesis at hand. For this reason alone p-values should be abandoned. Second, even some self-labeling Bayesians want to keep p-values, but in a Bayesian sense. This is to give an old error a new name, but it will still be an error. Third, and more interesting: the arguments commonly used to justify p-values are fallacies. Here is the proof.

It turns out that frequentist theory implies that the measure of difference (as in the race-income problem), actually called the “p-value of the test statistic”, is “uniformly distributed” when the null is true. What that means is discussed in the next section, but what the theory implies is (something like), “If the null is true, the p-value can be any number between 0 and 1, and is equally likely to be any of them”. The argument people employ, however, progresses like this: “The null entails that we see a p-value between 0 and 1. We see a p-value that is less than the magic number. Therefore, the null is false, or rather rejected as if it were false.”
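The uniformity claim can be checked with a small simulation. What follows is only a sketch under stated assumptions, not anything from the book: the test is a two-sided z-test of a mean with known standard deviation, the null (mean 0) is true by construction, and the sample size, number of repetitions, and function names are all arbitrary choices made for illustration.

```python
import math
import random

def two_sided_p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided z-test p-value for H0: mean == mu0, with sigma assumed known."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    # Standard normal CDF evaluated at |z|, via the error function.
    phi = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return 2.0 * (1.0 - phi)

def simulate_null_p_values(n_sims=20000, n=30, seed=1):
    """Draw n_sims samples for which the null is true; return their p-values."""
    rng = random.Random(seed)
    return [
        two_sided_p_value([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(n_sims)
    ]

def decile_shares(p_values):
    """Fraction of p-values falling in each tenth of the interval [0, 1]."""
    counts = [0] * 10
    for p in p_values:
        counts[min(int(p * 10), 9)] += 1
    return [c / len(p_values) for c in counts]

p_values = simulate_null_p_values()
shares = decile_shares(p_values)
frac_below_005 = sum(p < 0.05 for p in p_values) / len(p_values)

print([round(s, 3) for s in shares])   # each share should sit near 0.10
print(round(frac_below_005, 3))        # should sit near 0.05
```

Under these assumptions roughly a tenth of the p-values land in each tenth of the interval, and about one in twenty falls below 0.05 even though the null is true every single time.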

This argument is not valid because the first premise says we can see any p-value whatsoever, and since we do (see any value), it is actually evidence for the null and not against it. There is no p-value we could see that would be the logical negation of “0 < p-value < 1”, other than 1 or 0, which may of course happen in practice. (The simplest example is a test for a difference in proportions between two groups, where $n_1=n_2=1$ and where $x_1=1, x_2=0$, or $x_1=0, x_2=1$. Small samples frequently bust frequentist methods.) And when it does happen in practice, then regardless of whether the p-value is 0 or 1, either value legitimately falsifies the null, not just 0. That is, an observed p-value of 1 is evidence against the null, according to the argument.
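The text does not say which test it has in mind for the $n_1=n_2=1$ example. One reading that produces both extreme values, offered here purely as an assumption, is a one-sided (upper-tail) Wald test of the difference in proportions using the unpooled standard error. With one observation per group that standard error is exactly zero, so the statistic is infinite. The function name and the handling of the zero-error case are illustrative choices.

```python
import math

def one_sided_wald_p(x1, n1, x2, n2):
    """Upper-tail p-value for H0: p1 == p2 against p1 > p2.

    Assumption for illustration: unpooled (Wald) standard error.
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    if se == 0.0:
        if diff == 0.0:
            return float("nan")  # 0/0: the statistic is undefined
        # Zero standard error with a nonzero difference: z is +inf or -inf.
        z = math.copysign(math.inf, diff)
    else:
        z = diff / se
    # Standard normal CDF via the error function; erf(+-inf) is +-1.
    phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 1.0 - phi

p_a = one_sided_wald_p(1, 1, 0, 1)  # x1 = 1, x2 = 0
p_b = one_sided_wald_p(0, 1, 1, 1)  # x1 = 0, x2 = 1
print(p_a, p_b)
```

Under this assumed test the first case gives a p-value of exactly 0 and the second a p-value of exactly 1. Other tests (a pooled standard error, for instance) would give different numbers for the same data.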

Importantly, the first premise of that argument is not “If the null is true, then we expect a ‘large’ p-value,” because we clearly do not. The argument would be valid, and the null truly falsified, if the first premise were “If the null were true we would see a large p-value,” but nowhere in the theory of statistics is this kind of statement asserted, though something like it often is. R.A. Fisher, the inventor of p-values, was fond of saying the following, and something like it is quoted in nearly every introductory textbook:

Belief in null hypothesis as an accurate representation of the population sampled is confronted by a logical disjunction: Either the null is false, or the p-value has attained by chance an exceptionally low value.

This is the same argument as before; but Fisher’s “logical disjunction” is evidently goofy, as the first part of the sentence makes a statement about the unobservable null hypothesis, and the second part makes a statement about the observable p-value. And neither says anything at all about the hypothesis itself! But it is clear that there are implied missing pieces, and his quote can be fixed easily like this: “Either the null is false and we see a small p-value, or the null is true and we see a small p-value.” Or just: “Either the null is true or it is false and we see a small p-value.”

Since “Either the null is true or it is false” is a tautology, and is therefore necessarily true and thus can be removed, we are left with, “We see a small p-value.” Which is of no help at all. The p-value casts no direct light on the truth or falsity of the null. This result should not be surprising, because remember that Fisher argued that the p-value could not deduce whether the null was true; but if it cannot deduce whether the null is true, it cannot, logically, deduce whether it is false; that is, it cannot falsify the null.
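The reduction of the repaired disjunction to “We see a small p-value” can be verified mechanically with a truth table. This is a minimal sketch; the function and variable names are made up for the purpose.

```python
from itertools import product

def fixed_disjunction(null_true, small_p):
    """The repaired quote: (null false AND small p) OR (null true AND small p)."""
    return ((not null_true) and small_p) or (null_true and small_p)

# Enumerate all four combinations of truth values.
rows = [
    (null_true, small_p, fixed_disjunction(null_true, small_p))
    for null_true, small_p in product([True, False], repeat=2)
]

for null_true, small_p, value in rows:
    print(f"null_true={null_true!s:5}  small_p={small_p!s:5}  disjunction={value}")

# The disjunction has the same truth value as "small_p" in every row,
# so the status of the null never affects it.
equivalent = all(value == small_p for _, small_p, value in rows)
print(equivalent)
```

In every row the disjunction is true exactly when the small p-value is observed, whatever the status of the null.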

Current practice is that a small p-value is taken by everybody to mean “This is evidence the null is false or likely false.” That is because people are arguing like this: they take “For most small p-values I have seen in the past, the null has been false; I now see a new small p-value” as evidence for the proposition “The null hypothesis in this new problem is false.” But this doesn’t work because the major premise is false, or at least unknown.
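One way to see why the major premise cannot simply be taken for granted: whether “most small p-values come from false nulls” holds depends on quantities the p-value itself never reveals. The sketch below is a toy simulation, not an analysis from the book. It assumes a two-sided z-test, a made-up effect size when the null is false, and two made-up proportions of false nulls; every number and name in it is hypothetical.

```python
import math
import random

def share_of_false_nulls_among_small_p(prior_false, effect=0.5, n=30,
                                       n_sims=20000, alpha=0.05, seed=7):
    """Among simulated tests with p < alpha, the fraction whose null was false.

    prior_false: assumed proportion of tests in which the null is false.
    effect, n:   assumed true mean shift and sample size when it is false.
    """
    rng = random.Random(seed)
    small_total = 0
    small_and_false = 0
    for _ in range(n_sims):
        null_is_false = rng.random() < prior_false
        # The z statistic is drawn directly: N(0, 1) under the null,
        # N(effect * sqrt(n), 1) when the null is false.
        shift = effect * math.sqrt(n) if null_is_false else 0.0
        z = rng.gauss(shift, 1.0)
        phi = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
        p = 2.0 * (1.0 - phi)
        if p < alpha:
            small_total += 1
            small_and_false += null_is_false
    return small_and_false / small_total

mostly_false = share_of_false_nulls_among_small_p(prior_false=0.5)
mostly_true = share_of_false_nulls_among_small_p(prior_false=0.02)
print(round(mostly_false, 2), round(mostly_true, 2))
```

With the same test and the same cutoff, the premise comes out true in the first toy world and false in the second. Nothing in an observed small p-value says which world one is in.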

Given all this, and the myriad other criticisms no doubt well known to the reader, plus the ineradicable Cult of Point-Oh-Five, it is far past the time for p-values to go.


1 Comment

  1. McChuck

    “Faster, pussycat! Kill! Kill!”
    You can’t repeat the mantra enough.
