False is not True
We spoke earlier of falsification and why I didn’t think it was an especially useful criterion. My tweets about it inspired Deborah Mayo, who advocates for the new way of statistics (whereas I vouch for the old), to respond: “It’s not low prob but a strong arg from coincidence that warrants falsifying in sci. Essence -weasel word.” She links to her article on falsification, which I’ll assume you’ll read. Essence we’ll do later, though I responded “How can you tell a weasel from a statistician without essence? Answer: you cannot.”
Today a brief explanation why falsification isn’t exciting. For details, see my book Uncertainty: The Soul of Modeling, Probability & Statistics.
Falsified means something very specific. Words matter. We are not free to redefine them willy-nilly.
(Please forgive the use of notation; though it can lead to the Deadly Sin of Reification, it scoots things along.)
If a model X (a list of premises, data, etc.) for observable Y says Pr(Y | X) = p > 0, and Y is subsequently observed, then X has not been falsified. And this is so even if p is as small as you like, as long as it is greater than 0. Falsified means proved false. In mathematics and logic, prove has a definite, rigorous, inflexible, intransigent meaning. And thank the Lord for that. I propose keeping the word prove as it is and not stretching it to make some sociologist’s job easier. (Most researchers are anxious to say X did or did not cause Y.)
If Pr(Y | X) = 0, and Y is observed, and all, as in all, conditions in X have been met, then X has been proved false, i.e. it has been falsified. It is as simple as that. (Which part of X is the problem is another matter; see the book.)
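The rule is strict enough to fit in a few lines. A toy sketch (the function name `falsified` is mine, not anything standard):

```python
def falsified(prob_of_observed: float) -> bool:
    """A model X is falsified by an observation Y only when
    X gave Y probability exactly zero and Y occurred anyway."""
    if not 0.0 <= prob_of_observed <= 1.0:
        raise ValueError("probabilities live in [0, 1]")
    return prob_of_observed == 0.0

# A tiny p does not falsify; only p = 0 does.
print(falsified(1e-12))  # False: X survives, however improbable Y was
print(falsified(0.0))    # True: X said Y could not happen, yet it did
```

Note there is no threshold argument: no p, however small, short of zero, does the job.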
So why do people say, when Pr(Y | X) = p and p is small and Y is observed, that X has been “practically (or statistically) falsified”? Because they are mixing up probability and decision. They want a decision about X, and since that requires hard work and there is a tantalizingly small p offering to take the pain away, they say “X can’t be true because p is small.” Well, brother, X can be true if p is any number but 0. That’s not Briggs’s requirement: that’s the Universal Law of Logic.
Decision is Not Probability
That small p may be negligible to you, and so you are comfortable quashing X and therefore are willing to accept the consequences of that act of belief. But to another man, that p may not be small at all. The pain caused by incorrectly rejecting X may be too great for this fellow. How can you know? Bets, and believing or disbelieving X is a sort of bet, are individualistic, not universal.
Incidentally, p is not the “p-value”. The p-value says nothing about what anybody wants to hear about (it is a conditional statement about functions of observables given that model parameters equal certain values). I’ve defined and debunked the p-value so often that I won’t repeat those arguments here (see the book). Pr(Y | X) makes sense to everybody.
Now if you have to act, if you have to act as if X is true or false, then you will need to figure how p fits in with your consequences should X turn out true or false. That’s not easy! It has been made way, way too easy in classical statistical methods; so easy that consequences are not even an afterthought. They are a non-thought. P-values and Bayes factors act as magic, and say, “Yes, my son, you may believe (or not) X”, but they don’t tell you why, or give you p, or tell you what will happen if the hypothesis test is wrong.
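How the same small p justifies opposite acts for different men can be sketched with expected loss. A minimal illustration (the loss numbers and function name are invented for the example, not any standard procedure):

```python
def act_as_if_false(p_x_true: float, loss_if_wrongly_reject: float,
                    loss_if_wrongly_accept: float) -> bool:
    """Reject X (act as if it is false) only when the expected loss
    of rejecting is smaller than the expected loss of accepting."""
    expected_loss_reject = p_x_true * loss_if_wrongly_reject
    expected_loss_accept = (1 - p_x_true) * loss_if_wrongly_accept
    return expected_loss_reject < expected_loss_accept

p = 0.02  # the same small probability that X is true, for both men

# For one man, wrongly rejecting X costs little: he rejects.
print(act_as_if_false(p, loss_if_wrongly_reject=10, loss_if_wrongly_accept=1))

# For another, wrongly rejecting X is ruinous: he keeps X.
print(act_as_if_false(p, loss_if_wrongly_reject=10_000, loss_if_wrongly_accept=1))
```

Same p, different consequences, different decisions: the probability alone never settles the act.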
X is X
X is X, as Aristotle would agree. Make even the tiniest, weest, most microscopic change to X, such that the change cannot be logically deduced from the original X, and you no longer have X.
“Is Briggs saying that if you change X into something else, X is no longer X?”
“If X is changed to not-X, that means Pr(Y | X) probably won’t equal Pr(Y | not-X), and so the decision one would make whether to believe X is completely different from the decision one would make whether to believe not-X.”
“Sacré bleu! He is not the same choice!”
“Hang on. ‘Not-X’ isn’t some blanket ‘X is false’ statement. If I read my logic right, and I do, this abuse of notation ‘not-X’ means some very definite, well specified model itself, say, W.”
“Ouf! You are right! There is nothing special about that p. It is wholly dependent on X. That must mean all probability is conditional, which further weakens the utility of falsifiability.”
There is also the point that you might dislike X at Pr(Y1 | X) = p1, but love X at Pr(Y2 | X) = p2. The number of observable propositions Yi that are dependent on X may be many, and the choices you make could depend on what X says about these different propositions and which Yi are observed, which not (in these equations X is fixed). But did you notice you have to wait until Y is observed before you know how the model works? Decision is not easy!
There exist classes of “machine learning” models, some of which say things like Pr(Y1 | X) = 0, i.e. they make hard predictions, like the weatherman who says, “The high will be 75 F.” If the temperature is 75.1 F, the weatherman’s model has been falsified, because he implied that any number but 75 F was impossible. Some machine “learning” models are like that. But few or none would reject X if the model was just a little off, as the weatherman’s was.
In other words, even though the model has been falsified, few would act on that falsification. People add “fuzz” to the predictions, which is to say, the model might insist Pr(Y1 | X) = 0, but people make a mental substitution and say either Pr(Y1 | X) = 0 + p (a false deduction) or they will agree Pr(Y1 | X) = 0 but say Pr(Y1 | W) = p, where W is not X but which is similar enough to X that “It might as well be X”. That does not follow; it is an illegal maneuver.
Of course, with the weatherman, everybody understands him to mean not “The high will be 75 F” but “The high will be about 75 F”. Then Pr(High something far from 75 | Weatherman’s model) = p where p does not equal 0. The words “something far” and “about” are indefinite. This is fine, acceptable, and desirable. The last thing in the world we need is more quantification.
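The distance between the hard model X and the fuzzed substitute W can be put in code. A sketch (the 2 F tolerance and the 0.25 probability assigned by W are invented numbers for illustration; W is my stand-in for the substitute model, as above):

```python
def prob_under_X(high_f: float) -> float:
    """X makes a hard prediction: 'The high will be 75 F.'
    Every other temperature gets probability exactly zero."""
    return 1.0 if high_f == 75.0 else 0.0

def prob_under_W(high_f: float) -> float:
    """W, a different model, says 'about 75 F': anything within
    2 F gets positive probability (0.25 is an invented value)."""
    return 0.25 if abs(high_f - 75.0) <= 2.0 else 0.0

observed = 75.1
print(prob_under_X(observed))  # 0.0: X said 75.1 F was impossible; X is falsified
print(prob_under_W(observed))  # 0.25: W survives, but W is not X
```

Swapping W in for X after the fact and calling the result “X, near enough” is exactly the illegal maneuver described above.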
Is X true?
X is an assumption, a model. We accept for the sake of argument that X is true in judgments like Pr(Y | X). All of science is like this! If we want separate information on whether X is true, we need statements like this: Pr(X | outside evidence). One version of this is: outside evidence = X said Y couldn’t happen, but Y did. Then Pr(X | outside evidence) = 0, i.e. X has been falsified. But another example is this: outside evidence = X said Y might happen, and Y did. Then Pr(X | outside evidence) = q, where q > 0.
There will be lots of other outside evidence, because inside most X is a lot of mathematics, which we know is true based on various theorems; inside X will be data, which are assumed true based on measurement (usually); and other things. So really we should have been writing all along Pr(Y | X & E), where E = “background” evidence. All arguments are chains of reasoning back to indubitable propositions. Why?
No why here. That’s enough for this post. I’m abusing notation terribly here, but only because a full-blown treatment requires the sort of room you have in a book.
Did somebody say book? Uncertainty: The Soul of Modeling, Probability & Statistics.