
All Probability Is Conditional: An Answer To Senn; Part IV

A strictly incorrect way of writing Bayes’s theorem.

Read Part III.

Still with me? Hope so, because we’re only on the second page of Senn’s article (but don’t fret; we’ll be skipping most of it).

Review: in logical-probability Bayes (as in all Aristotelian logic), we begin with a list of premises (data, observations, evidence, or another synonymous term) and a proposition which is hoped to be related to the premises; from the premises we deduce the probability the proposition is true. Not all premises are sufficient to fix a probability, let alone a precise number: the probability could be stated merely in words, be nonexistent, be an interval, or be a precise number.

Senn writes “we let \Pr(A) stand for the so-called marginal probability of an ‘event’, ‘statement’ or ‘hypothesis’ A and we let \Pr(B|A) stand for the conditional probability…of B given A.” (Note: I have edited the notation ever so slightly so that it will render well on a web page; I have not changed any meaning.)

As before, I already disagree. There just is no such thing as unconditional probability, or probability without respect to any evidence; thus it never makes sense to write “\Pr(A).” We can write (say) \Pr(A|E), which is the probability of A (a proposition) given the evidence E (also a proposition, albeit possibly a complex one which includes data observations).

Example: A = “A ‘6’ shows” and E = “We have a Martian breen, etc.” (see the previous part for an explanation). But it makes no sense to ask, “What is the probability a ‘6’ appears” without reference to something—whether it be a breen, a die, or something else.
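To make the same point concrete, here is a minimal Python sketch (my illustration, not anything of Senn’s or from the earlier parts; the premise strings and the values they imply are invented for the example). Probability is treated as a function of both a proposition and its evidence, and there is simply no answer when the evidence is left out:

    # A sketch only: probability as a function of (proposition, evidence).
    # The premise strings and implied values below are assumptions for
    # illustration, not anything taken from Senn or the earlier parts.
    from fractions import Fraction
    from typing import Optional

    # Premise sets and the numerical probabilities they (are assumed to) imply.
    IMPLIED = {
        ("A '6' shows", "a device must show one of six states, exactly one labelled '6'"): Fraction(1, 6),
        ("A '6' shows", "a device must show one of two states, exactly one labelled '6'"): Fraction(1, 2),
    }

    def pr(proposition: str, evidence: Optional[str] = None) -> Fraction:
        if evidence is None:
            # The post's claim: there is no Pr(A), only Pr(A|E).
            raise ValueError("no unconditional probability: evidence is required")
        if (proposition, evidence) not in IMPLIED:
            # Not all premises fix a numerical value (see the review above).
            raise ValueError("these premises do not imply a numerical value")
        return IMPLIED[(proposition, evidence)]

    print(pr("A '6' shows", "a device must show one of six states, exactly one labelled '6'"))  # prints 1/6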

Failure to recognize this creates another stumbling block in understanding probability. Probability is usually introduced as unconditional, with the complexity of conditioning following some time later: but this is a mistake. There just is no such thing as unconditional probability, just as there is no unconditional truth: to judge any proposition true we always refer at least to our intuition or faith, if not to a list of premises.

Of course, inside a given problem, once we have E in hand, agreed to by all and always understood to be there, there is no harm, for simple ease of notation, in writing \Pr(A). But this shorthand should not be used until one is well used to logical probability, else probability seems a thing in itself and not a measure of knowledge.

Don’t think this is a big deal? Oh, boy, is it ever, as we’ll see.

Senn introduces H_i for a hypothesis, i.e. some proposition, indexed by i since more than one hypothesis may be in play in a given situation. He adds a superscript T to indicate we believe H_i is true, but I’ll skip this complication. Whenever we see shorthand like \Pr(H_i|E) it means “The probability H_i is true given E.”

Senn then writes (with my change in notation) “We suppose that we have some evidence E. If we are Bayesians we can assign a probability to any hypothesis H_i and, indeed, to the conjunction of this truth, H_i \& E, with evidence E.”

Subjective Bayesians would agree; logical probability Bayesians (LPBs) do not. Subjective Bayesians are as free with numerical probabilities as politicians are with other people’s money. Subjectivists “feel” probability is a matter of emotion because they fail to write down the conditioning premises. They may arrive at a number (and bet using it: subjectivists are inveterate gamblers, at least in theory), but the numbers they produce have no bearing on the probabilities unless it can be demonstrated that their (unwritten) premises imply these, and no other, numerical values. There is much more to the errors subjectivists make, but since they usually make them only in theory, and in most practical problems agree with LPBs, we’ll let these go until another day.

LPBs are more Socratic and admit their ignorance. The original series proves the LPBs are right: we cannot always discover a probability, and so we cannot always discover a numerical value for one. We can still manipulate the symbols until we get the form of an answer, but that doesn’t make the answer right. I believe Senn would agree with that.

Thus we can always write Bayes’s theorem, which Senn does:

\Pr(H_i | E) = \frac{\Pr(H_i) \Pr(E | H_i)}{\Pr(E)}.

Spot the trouble? What could it possibly mean to say \Pr(E)? Or even \Pr(H_i)? If we knew \Pr(H_i), the probability of H_i conditional on nothing at all, then we would already know how likely H_i is, and thus we wouldn’t have to bother with any kind of experiment or other evidence or indeed anything. We’d just write the answer down!

We’re in just as much trouble with \Pr(E). How can we ask about the probability of evidence we just witnessed? It has to be a probability of 1, right? Or it wouldn’t have happened!

The problem does not lie in Bayes’s theorem, which nobody disputes (how could they?), but in the way it is written and what the symbols mean. Senn is right that Subjective Bayesians can (and do) say anything, but that doesn’t mean what they say has any bearing on reality (I’ll let you provide the politician comparison).
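To preview the repair, keep to the \Pr(A|E) notation above and carry the background premises, call them B (my symbol, not Senn’s), through every term:

\Pr(H_i | E \& B) = \frac{\Pr(H_i | B) \Pr(E | H_i \& B)}{\Pr(E | B)}.

Written this way there are no bare probabilities left: \Pr(H_i | B) is deduced from the background premises alone, and \Pr(E | B) is the probability of the evidence judged before it is seen; once it is seen we have, trivially, \Pr(E | E \& B) = 1, which is the only sense in which witnessed evidence “has” probability 1.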

About that, more next time.


Categories: Philosophy, Statistics


  1. In an Euler diagram of all possible outcomes of a given experiment, a theory is a subset of these outcomes. Different theories are different subsets.

If a measurement Ex is not in subset Hi, Hi is false. And all the other theories Hk that do not contain Ex are false too.

    If a measurement Ey is in the subset Hi, then the theory Hi might be true. And all the other subsets Hk that contain Ey might be true too.

H and E are statistically independent, so the joint probability factors: Pr(Hi | Ey)·Pr(Ey) = Pr(Hi)·Pr(Ey); hence the conditional probability is Pr(Hi | Ey) = Pr(Hi).

This should be obvious: different theories can predict the same measurement. Once a measurement is predicted by some theory, it is not taken by that theory alone. Other theories must predict that measurement too, to be true.

  2. Sander,

    Good grief is that complicated. You lost me with subset.

    How about this? Assuming H_i is true, the probability of E is 0. We subsequently observe E. Therefore, H_i is false.

    Or assuming H_i is true, the probability of E is tiny (and as tiny as you like but not 0). We subsequently observe E. Therefore, H_i might still be true.

    Finis.

  3. “Senn is right that Subjective Bayesians can (and do) say anything, but that doesn’t mean what they say has any bearing on reality (I’ll let you provide the politician comparison).”
    Hmmm? Senn states the following in his article. He seems to say otherwise.

Before I do so, however, I want to make one point clear. I am not arguing that the subjective Bayesian approach is not a good one to use. I am claiming instead that the argument is false that because some ideal form of this approach to reasoning seems excellent in theory it therefore follows that in practice using this and only this approach to reasoning is the right thing to do. A very standard form of argument I do object to is the one frequently encountered in many applied Bayesian papers where the first paragraph lauds the Bayesian approach on various grounds, in particular its ability to synthesise all sources of information, and in the rest of the paper the authors assume that because they have used the Bayesian machinery of prior distributions and Bayes theorem they have therefore done a good analysis. It is this sort of author who believes that he or she is Bayesian but in practice is wrong.

  4. JH,

    Hmm. You’ll have to show me just where you think he says otherwise. Seems to me he agrees.
