Only a couple of days left, everybody. Then back to more interesting subjects. Incidentally: when is the best time to contact a statistician? Before you begin to collect data. You do not know how to store data as well as we do.
Today we learn the main difference, in philosophy and in practice, between objective, predictive Bayesian and classical analyses. To illustrate, I’m going to ask you some questions that will seem like tricks: but they are not. The answers are all obvious.
Here’s the situation: you are an emergency room physician examining patients suspected of appendicitis. Suppose you are interested in whether temperature helps predict appendicitis. Another way to state this: suppose temperature is correlated with appendicitis. Yet another way: does temperature have any relationship with appendicitis? Get it? We want to check whether temperature and appendicitis have any relationship. This is what you want to know.
Now, let’s imagine that temperature is in no way associated with appendicitis. That is, knowing whether a person has a fever, is normal, or is hypothermic means nothing for appendicitis. It is like the association between appendicitis and the number of green lollipops your corner grocery store has. That is, knowing the temperature or the number of lollipops tells you nothing about appendicitis.
Got it? Good, then answer this: Given this evidence, what is the probability that more people with high temperatures have appendicitis than do people with low temperatures? The answer is the same as this: what is the probability that more people whose corner stores have many lollipops have appendicitis than do people whose corner stores have few lollipops?
If these variables are in no way related to appendicitis, then the answer is obvious: any changes in the number of people with appendicitis are due to other factors (which we may or may not have measured). If we did see more people with appendicitis in the high lollipop (temperature) group, then this is just a coincidence. The probability is 50%.
OK so far? Now let’s imagine we ran a standard (logistic) regression using temperature to predict appendicitis. The software would spit out a p-value for the classical hypothesis test with the “null” hypothesis that the parameter associated with temperature is 0. Which is to say, that temperature has no effect. Let’s imagine the p-value associated with this test is 0.003. What would a classical statistician do?
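As a sketch of where such a number comes from: the software typically reports a Wald test, which compares the fitted coefficient to its standard error under a normal approximation. The coefficient and standard error below are made up for illustration, not taken from any real data set.

```python
import math

def two_sided_p(estimate, std_err):
    """Two-sided p-value for the Wald test of H0: parameter = 0,
    using the usual normal approximation."""
    z = abs(estimate) / std_err
    # survival function of the standard normal, doubled for two sides
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

# hypothetical temperature coefficient and its standard error
print(two_sided_p(0.9, 0.3))  # about 0.0027
```

A coefficient three standard errors from zero gives a p-value of roughly 0.003, which is why that figure appears above.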
Yes, “reject” the hypothesis that the parameter associated with temperature is 0 and say, therefore, that it is non-zero. That is, he would conclude that temperature is associated with appendicitis. Very well. But our statistician at least wants to check the (subjective) Bayesian way and so computes the posterior probability distribution for the parameter.
He discovers that, given the model is correct and the data he saw, the probability the parameter associated with temperature is greater than 0 is, say, 99.9%. Which is to say, he is pretty darn sure that temperature is positively associated with appendicitis: higher temperatures predict more appendicitis cases.
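A minimal sketch of where a number like 99.9% could come from: under a flat prior and the usual normal approximation to the posterior, the probability the parameter is positive is just the normal CDF evaluated at the estimate divided by its standard error. The values below are hypothetical.

```python
import math

def pr_parameter_positive(estimate, std_err):
    """Posterior Pr(parameter > 0 | data), assuming a flat prior and a
    normal approximation to the posterior."""
    z = estimate / std_err
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# hypothetical coefficient of 0.9 with standard error 0.3
print(pr_parameter_positive(0.9, 0.3))  # about 0.9987
```

Notice this is the same arithmetic as the classical test viewed from the other side, which is why the two analyses so often agree on "significance."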
The statistician now rests. He should not. Here’s why.
Suppose we know with certainty a parameter in a logistic regression model has the value 0.000001. What is the probability that this parameter is greater than 0? Think carefully now. The answer should be obvious.
In all cases I have ever heard of, 0.000001 &gt; 0. Therefore, the parameter is greater than zero. If this were so, then with enough data the p-value from any hypothesis test would shrink toward 0. Now that’s significant! The posterior probability would likewise show that the probability the parameter is greater than 0 is 100%. Thus, we could publish our paper claiming that the variable associated with this parameter is “highly” statistically significant, even to Bayesians.
But none of these facts answer our main question: how does temperature affect appendicitis? That’s what we wanted to know from the start, and with the classical analysis we stop short of learning the complete answer. Sure, the temperature parameter’s value of 0.000001 is greater than 0, but it is so small as to be useless.
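To see just how useless, plug that tiny coefficient into the logistic model directly and compare the predicted risk at a normal temperature and a high fever. The intercept of -2 below is a made-up value for illustration only.

```python
import math

def logistic(x):
    """Inverse-logit: maps the linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

b0, b1 = -2.0, 0.000001  # hypothetical intercept; the tiny "significant" slope

p_low = logistic(b0 + b1 * 36.5)   # predicted risk at a normal temperature
p_high = logistic(b0 + b1 * 40.0)  # predicted risk at a high fever
print(p_high - p_low)              # about 4e-7: practically nothing
```

A difference in risk on the order of one in a few million is invisible in practice, no matter what the p-value says.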
What we should be doing is calculating the probability that we see more (new) people with appendicitis and high temperatures than (new) people with appendicitis and low temperatures. We already agreed that if temperature had no effect, then this probability would be 50%.
If, then, after we fit our model (with the low p-values and the high posterior probabilities), we push ahead and compute the uncertainty in new observations and so discover that the probability we see more people with high temperatures and appendicitis is only 52%, what have we learned?
Well, that while temperature may formally be related to appendicitis, its predictive value is very low, and probably negligible. Knowing the value of a person’s temperature barely—ever so barely—changed our uncertainty in the values of new observables. It changed it so little that even though all classical measures say we should consider it, we really probably shouldn’t.
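One hedged way to arrive at a number like that 52% is simulation: draw new patients from the fitted model and count how often the high-temperature group produces more appendicitis cases than the low-temperature group. The model parameters below are hypothetical, chosen so that temperature barely matters.

```python
import math
import random

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def pr_more_high_temp_cases(b0=-2.0, b1=0.000001,
                            n_patients=50, n_sims=20000, seed=1):
    """Estimate Pr(more appendicitis cases among new high-temperature
    patients than among new low-temperature patients). Ties count as
    half, so a totally useless predictor gives exactly 50%."""
    rng = random.Random(seed)
    wins = ties = 0
    for _ in range(n_sims):
        high = sum(rng.random() < logistic(b0 + b1 * 40.0)
                   for _ in range(n_patients))
        low = sum(rng.random() < logistic(b0 + b1 * 36.5)
                  for _ in range(n_patients))
        if high > low:
            wins += 1
        elif high == low:
            ties += 1
    return (wins + 0.5 * ties) / n_sims

print(pr_more_high_temp_cases())  # close to 50%: temperature barely helps
```

With the tiny slope the answer hovers right around 50%; crank the slope up and the probability climbs, which is exactly the predictive measure of usefulness the parameter-centric analyses skip.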
And that is the difference between the old and the new.
I’m in a terrible hurry today and know that I’ve explained this badly. I’ll try again another day.