We have had two cases so far: arbitrary models for counterfactual Martians (Part I) and a deduced model for an urn holding dichotomous objects (Part II). The logic was identical for both, as it will be for all models. Now for an example closer to home for many users of statistics: a normal model.
Suppose we are interested in what will be the grade point average of young Susy after her first year of college. Our proposition of interest might be C = “Susy’s GPA will be G” where we can substitute for G any value between 0 and 4.
Strike that: not any value. Not all values between 0 and 4 are possible. Susy will receive a finite number of definite grades. Her grade point average, therefore, just like any freshman’s, can take only one of a finite number of values. Just what these are depends on the point system used at the college and the maximum and minimum number of classes that could be taken. For our purposes it’s only important to show that the values Susy’s GPA can take belong to a finite set which can be delineated (i.e. known or deduced).
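To make "a finite set which can be delineated" concrete, here is a minimal sketch in Python. The grading scheme is entirely made up for illustration: whole-number grade points from 0 to 4, equally weighted classes, and between 8 and 10 classes in the first year. A real college's rules would replace these assumptions.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

# Hypothetical grading scheme (assumptions, not from the text):
GRADE_POINTS = [0, 1, 2, 3, 4]    # F, D, C, B, A, all classes weighted equally
MIN_CLASSES, MAX_CLASSES = 8, 10  # classes a freshman could take in a year


def possible_gpas(grade_points, min_classes, max_classes):
    """Return the sorted list of every GPA reachable under the scheme."""
    gpas = set()
    for n in range(min_classes, max_classes + 1):
        # Order of grades does not matter, only which grades were received.
        for grades in combinations_with_replacement(grade_points, n):
            gpas.add(Fraction(sum(grades), n))
    return sorted(gpas)


gpas = possible_gpas(GRADE_POINTS, MIN_CLASSES, MAX_CLASSES)
print(len(gpas), "possible GPAs, from", gpas[0], "to", gpas[-1])
```

Exact fractions are used so that, say, 28/8 and 35/10 are recognized as the same GPA. However the scheme is changed, the set stays finite.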
However, like most other analysts, we’ll ignore this information, substitute something we know is false (given the evidence we just described), and say that Susy’s GPA can take any value between 0 and 4. Given no other evidence, we are in a bind because the number of values between 0 and 4 is (uncountably!) infinite. The deduced probability of Susy’s GPA taking any particular value (given the model that any value is possible) is 0. Screwy, ain’t it? That’s math for you.
And that is the math. Which shows you that the approximations and assumptions we use to make the math work out create absurdities when we apply that math to real-world problems. The number of possible values Susy’s GPA can take might be large, even very large, but it is still finite. If we had delineated the exact set, given the grade point system of the school, we could have used the statistical syllogism to say that, given this information, the probability of C (for any G in the set) is 1/N, where N is the cardinality (size) of the set.
To explain this man-made paradox further: suppose we do have the actual set which has N elements, e1, e2, …, eN. Given the information we have (and only this information), we deduce that the probability of Susy’s GPA equaling any ei is 1/N. But given we assume Susy’s GPA can take any value, the probability of any ei = 0. We conclude that the probability (given the information, etc.) of e1 = 0, of e2 = 0, …, of eN = 0. Summed, the probability of seeing that which we must see (relative to the true information) is 0. We are saying that Susy cannot have a GPA. Once we see the actual GPA printed on her report card, we have to admit that it isn’t possible that we’re seeing what we’re seeing, for the probability (given any number is possible) that the number we see is really there is 0.
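The arithmetic of the paradox can be checked directly. This is only a sketch: the finite set below (GPAs in steps of 1/8) is a hypothetical stand-in for the real one, and "any value is possible" is modeled as a uniform distribution over the interval from 0 to 4, which is an assumption of the example.

```python
from fractions import Fraction

# Hypothetical finite set of possible GPAs: 0, 1/8, 2/8, ..., 4.
support = [Fraction(k, 8) for k in range(33)]


def interval_prob(center, half_width, low=Fraction(0), high=Fraction(4)):
    """P(center - h <= GPA <= center + h) under a uniform model on [low, high]."""
    a = max(low, center - half_width)
    b = min(high, center + half_width)
    return max(Fraction(0), b - a) / (high - low)


# Finite model: each e_i gets 1/N, so the total is 1.
discrete_total = sum(Fraction(1, len(support)) for _ in support)

# "Any value" model: a single point is an interval of zero width.
continuous_total = sum(interval_prob(e, Fraction(0)) for e in support)

# Shrinking the interval around a GPA of 2 drives its probability toward 0.
for h in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    print(h, interval_prob(Fraction(2), h))

print(discrete_total, continuous_total)
```

Under the finite model the probabilities of the values we must see sum to 1; under the "any value" model the same values sum to 0.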
Again, that’s math for you. The good news is that once you’ve assimilated this, you’re ready for normal distributions. But before we get to them, let me tell you what should be done. We first deduce the actual set of possible GPAs. If we like, we can introduce the round-off rule, which then forms part of our model. The round-off rule states that we round all GPAs to the nearest hundredth. We do this because we know any decision we make on GPA is indifferent to numbers that differ by less than a hundredth. The choice of a hundredth comes from the extra-logical, extra-statistical decisions we will make with the GPA. We could instead round to the nearest tenth, or thousandth, or we needn’t round at all; whatever we like.
The statistical syllogism says that, in the absence of any other information (we have no other), the probability of Susy’s GPA equaling ei is 1/N for any i from 1 to N. We are done! This is the final answer.
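Here is a small sketch of the round-off rule followed by the statistical syllogism. The raw set of GPAs is again a made-up stand-in (whole-number grade points, 8 to 10 equally weighted classes), and rounding halves upward is my choice for the example; the text only says "nearest hundredth".

```python
from decimal import Decimal, ROUND_HALF_UP
from fractions import Fraction


def round_gpa(gpa, places="0.01"):
    """Round an exact GPA to the nearest hundredth (halves round up)."""
    exact = Decimal(gpa.numerator) / Decimal(gpa.denominator)
    return exact.quantize(Decimal(places), rounding=ROUND_HALF_UP)


# Hypothetical raw set: totals of whole-number grade points over n classes.
raw = {Fraction(k, n) for n in (8, 9, 10) for k in range(4 * n + 1)}

# The round-off rule is part of the model: it defines the set e_1, ..., e_N.
rounded = sorted({round_gpa(g) for g in raw})
N = len(rounded)

# Statistical syllogism: with no other information, each e_i gets 1/N.
uniform = {e: Fraction(1, N) for e in rounded}

print(N, uniform[Decimal("3.50")])
```

Passing `places="0.1"` or `places="0.001"` to `round_gpa` gives the other round-off rules mentioned above; N changes, the logic does not.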
Unless we want to use the information about previous college freshmen’s GPAs. We make the assumption that a sample of freshmen is “like” Susy, or that Susy is “like” them. That is, we assume the information in their GPAs is probative of Susy’s GPA. Is it? Maybe, maybe not. It is an assumption we make, part of the model we assume is true.
Perhaps the sample of GPAs comes mostly from “Business” majors and Susy is training to be a biologist. Is the sample probative? We just assume that it is. Later we can learn whether this assumption was useful. But for now, it’s assumption all the way.
So let’s assume. We have the old GPAs, we have the delineated set of possible GPAs, we have the result of the statistical syllogism; we have, then, all we need. The math gives us a discrete probability distribution which, after we plug the sample in, gives us the probability that Susy’s GPA takes any of the ei. These probabilities are deduced (as all probabilities are). They are true given our assumptions. The only iffy assumption we made is that the sample we used is probative.
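The post does not write out the math, so the sketch below should be read as one candidate and not as the deduced distribution itself. It assumes the generalized rule of succession: if ei was seen ni times in a sample of size n, the probability that the next GPA is ei is (ni + 1)/(n + N). The set of possible GPAs (tenths from 0 to 4) and the sample are both invented for the example.

```python
from collections import Counter
from fractions import Fraction


def predictive(support, sample):
    """Probability of each possible GPA given the sample.

    ASSUMPTION: uses the generalized rule of succession,
    (n_i + 1) / (n + N). The source does not state its formula.
    """
    unknown = set(sample) - set(support)
    if unknown:
        raise ValueError(f"sample values outside the possible set: {unknown}")
    counts = Counter(sample)
    n, N = len(sample), len(support)
    return {e: Fraction(counts[e] + 1, n + N) for e in support}


# Hypothetical set of possible GPAs, rounded to the nearest tenth.
support = [Fraction(k, 10) for k in range(41)]

# Made-up sample of previous freshmen's GPAs (not real data).
sample = [Fraction(31, 10), Fraction(28, 10), Fraction(31, 10),
          Fraction(35, 10), Fraction(28, 10), Fraction(31, 10)]

probs = predictive(support, sample)
print(probs[Fraction(31, 10)], probs[Fraction(20, 10)])
```

With an empty sample this formula gives back 1/N for every ei, which matches the answer from the statistical syllogism alone. Every ei keeps a non-zero probability, seen in the sample or not.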
No parameters are needed, so no “priors” are needed nor are “posteriors” given. There is no test or p-value. We only have probabilities for each ei, and no other. It’s probability all the way, from simple assumptions to deduced answers. We got just what we wanted: the probability of Susy’s actual GPA (possibly rounded) taking any value (in the set of ei). Isn’t probability as logic great?
On to normal distributions!
Update: How this all relates to climate models is coming!