Do not conduct or trust any regression, or any probability model, before you understand what “control” means and does NOT mean. NO CLASS NEXT WEEK.
Video
Links: YouTube * Twitter – X * Rumble * Bitchute * Class Page * Jaynes Book * Uncertainty
HOMEWORK: Given below; see end of lecture. Do the code!
Lecture
You will say “I GOT IT ALREADY!” when I remind you that “control” when used in modeling almost never means control. I have to remind you though, as much as it tires me to say, because the allure of believing “control” is control is strong. If your mental loins are ungirded, as opposed to thickly girded with endlessly repeated warnings, then you will be seduced.
Review the Right Way. I won’t repeat any of that here.
Let’s repeat our college GPA regression and “control” (but not control) for race, High School GPA, sex, and the scores on the SAT Math and Verbal sections (with obvious shorthand below; code at bottom). Here is the R version. You’ll see something similar in whatever software you use.
What does “(Intercept)” mean here? Think before reading further.
It is the central parameter for the normal distribution characterizing our uncertainty in new GPAs for (pay attention!) non-white females with High School GPAs of 0, SATMath scores of 0, and SATVerbal scores of 0. Yet actual minimum SAT section scores are greater than 0 (they are, or were, 200).
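If you doubt it, here is a quick check, a sketch assuming you have run the fitting code at the bottom: ask the model for its central parameter at those all-zero values and compare with the intercept.

# Sketch: the central parameter at all-zero covariates equals "(Intercept)"
newx = data.frame(White = 0, HSGPA = 0, Male = 0, SATM = 0, SATV = 0)
predict(fit, newdata = newx)   # same number as coef(fit)["(Intercept)"]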
This parameter is obviously an absurdity, and so this table is instant and complete proof that parameters (at least here) don’t exist and thus cannot have true values. It is proof that any parameter-centric analysis, of the kind done here, and in most scientific papers, is wrong.
As in not right. As in misleading to some degree.
I don’t know how to be any clearer about this. If you know, please let me know, so we can help others. Almost every regression you see is therefore wrong to some degree. Thus judgements reported based on them will be wrong. Plus we already know we cannot discern cause from these models, so all claims of cause will be wrong.
Still, the model, done the Right Way, might be useful in turning our correlations into probabilistic predictions. That remains to be seen, because we already know we have substantial probability leakage with previous regressions. Adding terms, like the SAT scores, can only make it worse. It will be your homework to check this.
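Here is one hedged sketch of how you might do that check, assuming GPAs must live between 0 and 4 (adjust the bounds for your school) and using the frequentist fit from the code at the bottom. It ignores parameter uncertainty, so treat it as a lower bound on the leakage:

# Sketch: probability the predictive normal puts on impossible GPAs,
# i.e. below 0 or above 4; ignores parameter uncertainty
s = sigma(fit)       # residual spread of the fitted model
mu = predict(fit)    # central parameter for each student
leak = pnorm(0, mu, s) + (1 - pnorm(4, mu, s))
summary(leak)        # any mass here is leakage: probability for impossible GPAs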
All that said, the “Estimates” are guesses for the values of the non-existent parameters. The next two columns are only in service of the P-value, the last column. The more asterisks, the more magic. Minimum magic is needed to declare The Science. We are sick unto death of P-values, but once again I remind you all uses of them are formal fallacies.
To take one example, SATM, the math score. Its “estimate” is the change in the central parameter of the normal distribution characterizing our uncertainty in GPA for every 1 unit increase in SATM. Change-increase, change-increase, it is always change-increase. Even for dichotomous measures like Male, the increase is moving from Female (coded 0) to Male (coded 1). Go up 10 points in math score, and the change in the parameter is 10 x -0.0001 = -0.001.
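In code, assuming the fit from the listing at the bottom, that arithmetic is just:

# Change in the central parameter for a 10-point rise in SATM
10 * coef(fit)["SATM"]   # about -0.001 with the estimates shown above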
The P is not wee, so researchers will say, and say with a straight face, “SAT math score has no effect on college GPA”, a causal statement they are forbidden to make but still do. It is also very likely false, given what we know about people with high and low math scores.
And so on for the other “Estimates”. I won’t bore you with them in writing. See the video. And do them for homework. And do probability leakage!
Next comes “confidence intervals”, which again mean, officially, nothing. As in nothing. One may only say the “true” value is in the interval or not. No living human being interprets confidence intervals properly. All, with no exception, cheat, and treat them as Bayesians, which we do in a moment. But there they are:
What can we say about them? Nothing. But people will say, falsely, things like “Whites score anywhere from 0.01 to 0.38 GPA points higher than non-Whites, when controlling for HSGPA, etc. etc.”
No. All we had to do was go in and count to see how Whites scored, at whatever values of HSGPA, etc.
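A sketch of that counting, using the same data (pick whatever cross-classification interests you):

# Compare GPAs directly: no parameters, just counting and averaging
aggregate(GPA ~ White, data = FirstYearGPA, FUN = mean)
# Or cross-classify White by binned HSGPA and look at the counts
table(FirstYearGPA$White, cut(FirstYearGPA$HSGPA, breaks = 3))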
Now you see why we spent all those weeks on parameters, P-values, confidence intervals, and all that: so that we didn’t have to introduce them here for the first time, where the temptation is strong to include what you already know about GPA as if it were in the model.
Read that again. It happens. You know lots of GPAs, SATs, and whatnot. You cannot help but seek to rediscover those things in the model. You seek the pattern. You see a hint of it and declare the model has confirmed the causes you thought of. No. You brought them. The model is ignorant. It sees nothing. It is dumb computer code. You speak for it.
The Bayesian way is the same, more or less. We’ll use, as we have been, the default guesses (“priors”) on the parameters, because that choice is of little consequence after choosing the model form. And there are no true values anyway.
We get output like this (code below):
The interpretations of “Estimates” (i.e. coefficients or parameters) do not change. The mean is the “point” estimate, as above. Only here the intervals have a probability interpretation. For instance, given the model and data, there is an 80% chance the “true” value of “(Intercept)” is between 0.288 and 1.116.
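If you want those intervals directly, rstanarm will give them (a sketch, assuming the Bayesian fit from the code below):

# 80% posterior intervals, matching the 10% and 90% columns of the summary
posterior_interval(fit, prob = 0.80)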
Effect Size: the means here and “estimates” above are called “effect sizes”. Most take that to mean the changes in GPA. Which is false. Which is wrong. As in it is not true. That is causal language. All talk about “effects” implies causes. We know this is false because we have previously proved these models don’t provide cause.
Which means all such talk is misplaced. But it is extremely common. Which is yet another source of enormous over-certainty in science.
These parameters or “estimates” represent changes or modifications to—and I must insist on being verbose here—the central parameter of the normal distribution representing our uncertainty in GPA. We also have to know about the spread parameter, but that is missing, and almost always ignored. Not in the Right Way, of course.
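You can at least look at it. A sketch, assuming the fits from the code below:

# The usually ignored spread parameter
sigma(fit)   # works on the glm fit; rstanarm reports it as the "sigma" row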
There will also be in the Bayesian printouts, depending on the integration method used, various matters about convergence of the “estimates.” These are important in many cases, but since the integration method is only a tool, its philosophical importance is small. Naturally, since it’s small, it becomes the object of most curiosity to some. We’ll look into those another day, because they are really beside the point.
Homework: do it all yourself, with probability leakage. Or, better, find your own dataset to do it on.
# Data and model-fitting packages
require(Stat2Data)
require(rstanarm)
data(FirstYearGPA)

# Frequentist regression (gaussian is the glm default)
fit = glm(GPA ~ White + HSGPA + Male + SATM + SATV, data = FirstYearGPA)
summary(fit)
confint(fit)

# Bayesian regression with rstanarm's default priors
fit = stan_glm(GPA ~ White + HSGPA + Male + SATM + SATV, data = FirstYearGPA, iter = 1e5)
print(summary(fit), digits = 3)