Who is more likely to support the death penalty: college undergraduates from a “nationally-ranked Midwestern university with an enrollment of slightly more than 20,000” majoring in social work, or those majoring in something else?
This question was put by Sudershan Pasupuleti, Eric Lambert, and Terry Cluse-Tolar of the University of Toledo to 406 students, 234 of whom were social work undergraduates. The answer was published in the Journal of Social Work Values and Ethics.
“58% of the non-social work majors favored to some degree capital punishment” and only “36% of social work students” did. They report that these percentages (58% vs. 36%) represent a statistically “significant difference in death penalty support between social work and non-social work majors.” The p-value (see below) was 0.001.
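The published percentages and group sizes are enough to reconstruct the test, at least roughly. Here is a sketch in Python of the standard two-proportion z-test, using counts back-calculated from the reported figures (the paper does not give the raw counts, so the p-value here need not match theirs exactly):

```python
# A back-of-the-envelope two-proportion z-test using counts implied by
# the published figures: 406 students, 234 social work majors, hence
# 172 non-social work; 36% of 234 ~ 84 and 58% of 172 ~ 100 in favor.
# The exact counts are not reported, so these are reconstructions.
from math import sqrt
from scipy.stats import norm

n_sw, fav_sw = 234, round(0.36 * 234)   # social work majors
n_ns, fav_ns = 172, round(0.58 * 172)   # everyone else

p_sw, p_ns = fav_sw / n_sw, fav_ns / n_ns
pooled = (fav_sw + fav_ns) / (n_sw + n_ns)   # rate assuming "no difference"
se = sqrt(pooled * (1 - pooled) * (1 / n_sw + 1 / n_ns))
z = (p_ns - p_sw) / se
p_value = 2 * norm.sf(abs(z))                # two-sided

print(f"z = {z:.2f}, p-value = {p_value:.5f}")   # well below 0.05
```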
What does statistically significant mean? Before I tell you, let me ask you a non-trick question. What is the probability that, for this study, a greater percentage of non-social work majors favored the death penalty? The probability is 1: it is certain that a greater percentage of non-social work majors favored the death penalty, because 58% is greater than 36%. The answer would be the same if the observed percentages were 37% and 36%, right? The size of the difference does not matter: different is different. Significance is not a measure of the size of the difference. Further, the data we observed tells us nothing directly about other groups of students (who were not polled and whose opinions remain unknown). Neither does significance say anything about new data: significance is not a prediction.
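If it is not the size of the difference, what does drive “significance”? Mostly the sample size. A toy calculation makes the point: hold the gap fixed at 37% versus 36% (numbers invented for illustration) and watch the p-value shrink as the groups grow:

```python
# A sketch showing that "significance" tracks sample size, not the size
# of the difference: the same 37% vs. 36% gap is "not significant" with
# 200 students per group but "significant" with 20,000 per group.
# (All counts here are invented for illustration.)
from math import sqrt
from scipy.stats import norm

def two_prop_p_value(p1, p2, n):
    """Two-sided two-proportion z-test with n subjects in each group."""
    pooled = (p1 + p2) / 2
    se = sqrt(pooled * (1 - pooled) * 2 / n)
    return 2 * norm.sf(abs(p1 - p2) / se)

for n in (200, 2_000, 20_000):
    print(f"n = {n:>6}: p-value = {two_prop_p_value(0.37, 0.36, n):.4f}")
# n =    200: p-value ~ 0.84
# n =   2000: p-value ~ 0.51
# n =  20000: p-value ~ 0.038  -- now "significant", same 1-point gap
```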
Since significance is not a direct statement about data we observed nor is it a statement about new data, it must measure something that cannot be observed about our current data. This occultism of significance begins with a mathematical model of the students’ opinions; a formalism that we say explains the opinions we observed: not how the students formed their opinions, only what they would be. Attached to the model are unobservable objects called parameters, and attached to them are notions of infinity which are so peculiar that we’ll ignore them (for now).
A thing that cannot be observed is metaphysical. Be careful! I use these words in their strict, logical sense. By saying that some thing “cannot be observed”, I mean just that. It is impossible—not just unlikely—to measure its value or verify its existence. We need, however, to specify values for the unobservable parameters or the models won’t work, but we can never justify the values we specify because the parameters cannot be seen. This predicament is sidestepped—not solved—by saying, in a sense, that we don’t care what values the parameters take, only that they are equal in all parts of our model. For this data, there are two parts: one for the social work majors and one for the non-social work majors.
Significance takes as its starting point the model and the claim of equality of its parameters. It then calculates a probability statement about data we could have seen but did not, assuming that the model is true and its parametric equalities are certain: this probability is the p-value (see above) which has to be less than 0.05 to be declared “significant.”
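You can watch this definition at work in a small simulation (one standard way to compute a p-value, not necessarily the method the authors used). Assume, as significance does, that both groups share one fixed parameter—here the pooled rate from the back-calculated counts above—then generate many data sets we could have seen but did not, and count how often the simulated gap is at least as large as the one we actually observed:

```python
# A minimal simulation of what the p-value measures: assuming the model
# is true and both groups share one fixed parameter (the pooled rate of
# 184/406), generate data we *could* have seen and count how often the
# simulated gap is at least as large as the observed one. Counts are
# the same back-calculated figures as above; the seed is arbitrary.
import numpy as np

rng = np.random.default_rng(42)
n_sw, n_ns = 234, 172
observed_gap = 100 / 172 - 84 / 234       # ~0.22
pooled = (100 + 84) / (234 + 172)         # the assumed common parameter

sims = 100_000
gap = (rng.binomial(n_ns, pooled, sims) / n_ns
       - rng.binomial(n_sw, pooled, sims) / n_sw)
p_sim = np.mean(np.abs(gap) >= observed_gap)

print(f"simulated p-value ~ {p_sim:.5f}")   # essentially zero, agreeing with the test
```

Notice what the simulation never touches: the students themselves, their reasons, or any future students. Every number it produces is conditional on the invented equality of the invisible parameters.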
Remember! This probability says nothing directly about the actual, observed data (nor does it need to, because we have complete knowledge of that data), nor does it say anything about data we have not yet seen. It is a statement about, and conditional on, metaphysical assumptions—namely, that the model we picked is true and that its parameters are equal—assumptions which, because they are metaphysical, can never be checked.
Pasupuleti et al. intimated they expected their results, which implies they were thinking causally about their findings, about what causes a person to be for or against capital punishment. But significance cannot answer the causal question: is it because the student was a social work major that she is against capital punishment? Significance cannot say why there was a difference in the data, even though that is the central question, and is why the study was conducted. We do not need significance to say if there was a difference in the data, because one was directly observed. And significance cannot say if we’d see a difference (or calculate the probability of a difference) in a new group of students.
There was nothing special about Pasupuleti’s study (except that it was easy to understand): any research that invokes statistical significance suffers from the same limitations, the biggest of which is that significance cannot do what people want it to do, which is to give assurance that the differences observed will persist when measured in new data.
Statistical significance, then, is insignificant, or insufficient, for use in answering any practical question or in making any real decision. Why, then, is significance used? What can be done instead? Stick around.