Gallup has published two new polls. The first estimates the percent of those desiring non-procreative sex in each state. The second guesses the percent of non-affiliation with traditional religion (Christianity). We can learn some simple statistics by examining both together.
The first poll is “Vermont Leads States in LGBT Identification”, which is slightly misleading. Vermont comes in at 5.3% sexually non-procreative, but Washington DC is a whopping (and unsurprising) 8.6%. South Dakota is the most procreative (relatively speaking) state at only 2%.
This assumes, as all these polls do, that everybody tells the truth. That’s a suspect assumption here—in both directions. People in more traditional places might be reluctant to admit desiring non-procreative sex, while those in hipper locales might be too anxious. So, there is a healthy plus-or-minus attached to official numbers. Gallup puts this at +/- 0.2 to 1.6 percent, depending on the sample from each state. But that’s only the mathematical uncertainty, strictly dependent on model assumptions. It does not include lying, which must bump up the numbers. By how much nobody knows.
Poll number two is “The Religious Regions of the U.S.”, which is “based on how important people say religion is to them and how often they attend religious services.” Make that traditional religious services. The official religion of the State is practiced by many, though they usually don’t admit to that religion being a religion, and those who say they don’t attend services may still dabble in yoga, equality, and so forth. This makes the best interpretation of “not religious” as used in the poll “not traditionally religious”, which is to say, not Christian (for most of the country). The official +/- is 3-6%, depending on the state.
Here is what statisticians call a correlation:
A glance suggests that as traditional irreligion (henceforth just irreligion) increases, so too does non-procreative sex. But there is no notion of direction of cause. It’s plausible, and even confirmed in some cases, that lack of religion drives people to identify as sexually non-procreative. But it’s also possible, and also confirmed by observation, that an increase in numbers of sexually non-procreative causes others to abandon traditional religion.
Now “cause” here is used in a loose sense, as one cause of many, but a notable one. It takes more than just non-procreative sex for a person to abandon Christianity, and it takes more than abandoning Christianity to become sexually non-procreative. And, indeed, the lack of cause is also possible. Some sexually non-procreative remain religious, and most atheists are not sexually non-procreative (but see this).
All this means is that imputing cause from this plot cannot be done directly. It has to be done indirectly, with great caution, and by using evidence beyond the data of the plot. Here, the causes, if confirmed, are weak in the sense that they are only one of many. Obviously some thing or things cause a person to abandon traditional religion (assuming they held it!), and some thing or things cause a person to become sexually non-procreative. Religion and the presence of non-procreative sex are each only one of these causes, and not causes at all in some cases.
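One way to see why the plot by itself carries no direction: the usual (Pearson) correlation coefficient is symmetric, so swapping the two variables gives the same number. Here is a minimal sketch. The function name pearson and the five data pairs are invented for illustration; they are not the Gallup figures.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustrative percentages, NOT the Gallup numbers.
irreligion = [15.0, 22.0, 30.0, 41.0, 48.0]
nonprocreative = [2.4, 3.1, 3.0, 4.2, 4.6]

r_xy = pearson(irreligion, nonprocreative)
r_yx = pearson(nonprocreative, irreligion)

# The two values agree: the coefficient cannot say which way any cause runs.
print(r_xy, r_yx)
```

Whatever the number turns out to be for the real data, it is the same number read in either direction, which is why evidence from outside the plot is needed to say anything about cause.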
The best that we can therefore do is correlation. We can use the data to predict uncertainty. But in what? All 50 states plus DC have already been sampled. We don’t need to predict a state. We do not need any statistical model or technique—including hypothesis testing or wee p-values—if our interest is in states. Any hypothesis test would be badly, badly misplaced. We already know we cannot identify cause, so what would a hypothesis test tell us? Nothing.
Now states are not homogeneous. New York, for instance, is one tiny but well-populated progressive enclave appended to a massive but sparsely populated traditionalist mass (with some exceptions in the interior). If we assume the data will be relevant and valid for intra-state regions, then we can use it to predict uncertainty.
For instance, counties. If we knew a county’s percent of irreligion, we could predict the uncertainty in the percent of sexually non-procreative. Like this:
That envelope says, given all the assumptions, the old data, and assuming a regression is a reasonable approximation (with “flat priors”), there is an 80% chance a county’s percent sexually non-procreative would lie between the two lines, given a fixed percent irreligion. This also assumes the data are perfectly measured, which we know they are not. But since we do not know how this would add formally to the uncertainty, we have to do this informally, mentally widening the distance between the two lines by at least a couple of percent. Or by reducing that 80%.
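For readers who want to see how an envelope of this kind can be produced, here is a rough sketch, not the code behind the plot. It assumes a straight-line regression with the standard noninformative prior (flat on the intercept, the slope, and the log of the error spread), under which the prediction for a new observation is a scaled Student-t. The function names fit_line and predictive_interval and the eight data pairs are hypothetical; the real poll numbers are not reproduced here. Measurement error is not included, just as in the text.

```python
import math
import random

def fit_line(x, y):
    """Least-squares line plus the pieces needed for the predictive spread."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid_ss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    s = math.sqrt(resid_ss / (n - 2))  # residual standard deviation
    return intercept, slope, s, mx, sxx

def predictive_interval(x, y, x_new, prob=0.80, draws=20000, seed=1):
    """Central predictive interval for a new y at x_new, by simulation."""
    rng = random.Random(seed)
    n = len(x)
    df = n - 2
    intercept, slope, s, mx, sxx = fit_line(x, y)
    centre = intercept + slope * x_new
    # Spread grows as x_new moves away from the mean of the observed x.
    scale = s * math.sqrt(1 + 1 / n + (x_new - mx) ** 2 / sxx)
    sims = []
    for _ in range(draws):
        z = rng.gauss(0, 1)
        chi2 = rng.gammavariate(df / 2, 2)  # chi-square with df degrees of freedom
        sims.append(centre + scale * z / math.sqrt(chi2 / df))  # Student-t draw
    sims.sort()
    lo = sims[int((1 - prob) / 2 * draws)]
    hi = sims[int((1 + prob) / 2 * draws) - 1]
    return lo, hi

# Hypothetical illustrative percentages, NOT the Gallup numbers.
irreligion = [15.0, 20.0, 24.0, 28.0, 33.0, 37.0, 42.0, 48.0]
nonprocreative = [2.3, 3.0, 2.8, 3.4, 3.3, 4.1, 3.9, 4.7]

low, high = predictive_interval(irreligion, nonprocreative, x_new=30.0)
print(low, high)
```

Repeating the call over a grid of x_new values and joining the lower and upper ends traces out two lines like those in the plot. The informal widening for imperfect measurement still has to be done by hand.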
Example: if percent irreligion is 20%, there is less than an 80% chance percent sexually non-procreative is 2.1-4.2%. And if percent irreligion is 40%, there is less than an 80% chance percent sexually non-procreative is 3.1-5.2%.
These probabilities are exact given we accept the premises. We can already see, however, the model is weak; it does not explain places like DC. How would it work in San Francisco? Or Grand Rapids, Michigan?