Here is a question I added to my chapter on logic today.
New York City “Health Czar” Thomas Frieden (D), who successfully banned smoking and trans fat in restaurants and who now wants to add salt to the list, said in an issue of Circulation: Cardiovascular Quality and Outcomes that “cardiovascular disease is the leading cause of death in the United States.” Describe why no government or no person, no matter the purity of their hearts, can ever eliminate the leading cause of death.
I’ll answer that in a moment. First, Frieden is engaged in yet another attempt by the government to increase control over your life. Their reasoning goes “You are not smart enough to avoid foods which we claim—without error—are bad for you. Therefore, we shall regulate or ban such foods and save you from making decisions for yourself. There are some choices you should not be allowed to make.”
The New York Sun reports on this in today’s paper (better click on that link fast, because today could be the last day of that paper).
“We’ve done some health education on salt, but the fact is that it’s in food and it’s almost impossible for someone to get it out,” Dr. Frieden said. “Really, this is something that requires an industry-wide response and preferably a national response.”…”Processed and restaurant foods account for 77% of salt consumption, so it is nearly impossible for consumers to greatly reduce their own salt intake,” they wrote. Similarly, regarding sugar, they wrote: “Reversing the increasing intake of sugar is central to limiting calories, but governments have not done enough to address this threat.”
Get that? It’s nearly impossible for “consumers” (they mean people) to regulate their own salt intake. “Consumers” are being duped and controlled by powers greater than themselves; they are being forced to eat more salt than they want. But, lo! There is salvation in building a larger government! If that isn’t a fair interpretation of the authors’ views, then I’ll (again) eat my hat.
The impetus for Frieden’s latest passion is noticing that salt (sodium) is correlated—but not perfectly predictive of, it should be emphasized—with cardiovascular disease, namely high blood pressure (HBP). This correlation makes physical sense, at least. However, because sodium is only correlated with HBP, it means that for some people average salt intake is harmless or even helpful (Samuel Mann, a physician at Cornell, even states this).
What is strange is that, even by Frieden’s own estimate (from the Circulation paper), the rate of hypertension in NYC is four percentage points lower than in the rest of the nation! NYC is at about 26%; the rest of you are at about 30%. If these estimates are accurate, it means New York City residents are doing better than nonresidents. This would argue that we should mandate that non-city companies emulate the practices of the restaurants and food processors that serve the city. It in no way follows that we should burden city businesses with more regulation.
[E]xecutive vice president of the New York State Restaurant Association, Charles Hunt…said any efforts to limit salt consumption should take place at home, as only about 25% of meals are consumed outside the home.
“I’m concerned in that they have a tendency to try to blame all these health problems on restaurants…This nanny state that has been hinted about, or even partially created, where the government agencies start telling people what they should and shouldn’t eat, when they start telling restaurants they need to take on that role, we think its beyond the purview of government,” Mr. Hunt said.
Amen, Mr Hunt. It just goes to show you why creators and users of statistics have such a bad reputation. Even when the results are dead against you, it is still possible to claim what you want to claim. It’s even worse here, because it isn’t even clear what the results are. By that I mean, the statements made by Frieden and other physicians are much more certain than they should be given the results of his paper. Readers of this blog will not find that unusual.
What follows is a brief but technical description of the Circulation paper (and homework answer). Interested readers can click on.
The paper is the result of a survey on hypertension. Surveyors ran around the city and asked just over 3000 people to participate. A little less than 2000 agreed to help a little, and only about 1700, or 55%, gave all the information requested. Incidentally, I don’t expect anybody to take my word for any of this: download the paper and follow along with my summary.
If you know the city, you know it’s blocked off by neighborhoods that tend to, roughly, fall along racial and ethnic boundaries. The researchers knew this and correctly tried to sample by these blocks. They couldn’t get fair representation of these groups: say, for example, that Group A is known to make up 12% of city residents but only 5% of the people the survey could find. It was also the case that the response rate varied by block, which should add another “grain of salt” with which you read the results. Because of these factors, they decided to use a complicated weighting formula to adjust all their results. What you are seeing, then, are not the raw numbers. You are seeing the result of an error-prone statistical model. They say:
Analyses were weighted to adjust for the complex sampling design and nonresponse; weights were poststratified to represent the NYC adult population on age, sex, race/ethnicity, and borough of residence, then further adjusted to address component- and item-level nonresponse. SUDAAN version 9.0 (Research Triangle Institute, Methods Research Triangle Park, NC) was used to apply sample weights and to obtain standard error estimates by Taylor series linearization.[1]
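To see what “poststratified” weighting does in its simplest form, here is a minimal sketch in Python. It is not the paper’s formula, which adjusts on many more variables and was run in SUDAAN. Only Group A’s 12% and 5% come from my example above; the second group, the per-group hypertension rates, and the function names are made up for illustration.

```python
# Population and sample shares. Only Group A's 12% vs 5% comes from the
# example in the text; Group B is a hypothetical "everyone else" bucket.
population_share = {"A": 0.12, "B": 0.88}
sample_share = {"A": 0.05, "B": 0.95}

# Made-up hypertension rates observed in each group of the sample.
observed_rate = {"A": 0.40, "B": 0.25}


def poststratification_weights(population_share, sample_share):
    """Weight for each group = population share / sample share."""
    return {g: population_share[g] / sample_share[g] for g in population_share}


def weighted_rate(rates, sample_share, weights):
    """Average the group rates using the reweighted sample shares."""
    total = sum(sample_share[g] * weights[g] for g in rates)
    return sum(rates[g] * sample_share[g] * weights[g] for g in rates) / total


weights = poststratification_weights(population_share, sample_share)
raw = sum(observed_rate[g] * sample_share[g] for g in observed_rate)
adjusted = weighted_rate(observed_rate, sample_share, weights)

print("weights:", weights)
print("raw rate:", round(raw, 4))
print("adjusted rate:", round(adjusted, 4))
```

In this toy case every Group A respondent counts 2.4 times, so whatever quirks the few people who answered happen to have are magnified by the same factor. The published rates are the adjusted kind, not the raw kind.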
The numbers were further massaged by what is called a logistic regression model. They “adjusted [their] models based on known risk factors, including age, race/ethnicity, sex, place of birth, education, income level, insurance status, and having a routine place for care.” This is not an unusual thing to do, but readers should be aware that they are not getting direct numbers, but instead estimates of parameters of statistical models.
Readers of this blog will recall that nearly all statistical procedures are designed to make statements about (unobservable) parameters, and that probability statements regarding those parameters will always be more certain than statements about the actual observable data which they parameterize. The short way to say this is that if you leave any statistical analysis thinking only of parameter statements, you will be too sure of yourself.
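A small sketch can make the gap concrete. It assumes a plain normal model and uses simulated blood pressure readings (the mean of 125 and standard deviation of 15 are invented, and nothing here comes from the paper). It compares the usual 95% interval for the mean, a parameter, with the approximate 95% interval for the next person’s actual reading, an observable.

```python
import math
import random
import statistics


def interval_widths(data, z=1.96):
    """Return (parameter interval width, predictive interval width).

    Both use the normal approximation: the interval for the mean has
    half-width z*s/sqrt(n); the interval for one new observation has
    half-width z*s*sqrt(1 + 1/n).
    """
    n = len(data)
    s = statistics.stdev(data)
    param_width = 2 * z * s / math.sqrt(n)
    predictive_width = 2 * z * s * math.sqrt(1 + 1 / n)
    return param_width, predictive_width


rng = random.Random(42)
# Hypothetical readings; 1700 mirrors the number of complete responses.
readings = [rng.gauss(125, 15) for _ in range(1700)]

param_width, predictive_width = interval_widths(readings)
print("width of interval for the mean:  ", round(param_width, 2))
print("width of interval for a new value:", round(predictive_width, 2))
print("ratio:", round(predictive_width / param_width, 1))
```

Under these assumptions the predictive interval is wider than the parameter interval by a factor of exactly sqrt(n + 1), so the more data you collect, the bigger the temptation to mistake certainty about the parameter for certainty about people.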
Very well. The results, as pointed out above, found that NYC residents have less hypertension than nonresidents. They also found “disparities”, i.e. certain groups, ages, races, etc. had hypertension rates different than other groups etc. Apparently one such “disparity” is that Mexican-Americans have lower hypertension rates than do Dominican- and Puerto Rican-Americans. And more old people than young have hypertension. Most of these findings are not new.
No social medical study worth its salt will publish a paper nowadays without pointing out “disparities.” I keep meaning to show you, my loyal readers, a calculation which shows that “disparities” are inevitable in datasets of sufficient size, even when nothing is going on. But that is a subject for another time.
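Until then, here is a rough simulation in that spirit. It is not the promised calculation, only a sketch under invented assumptions: ten hypothetical groups of 170 people each, every one of whom has the same 28% chance of hypertension, so that by construction “nothing is going on.”

```python
import random


def simulate_group_rates(n_groups=10, group_size=170, true_rate=0.28, seed=1):
    """Observed rate in each group when every person has the same true rate.

    All parameter values are hypothetical; none come from the paper.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(n_groups):
        cases = sum(rng.random() < true_rate for _ in range(group_size))
        rates.append(cases / group_size)
    return rates


rates = simulate_group_rates()
gap = max(rates) - min(rates)

print("group rates:", [round(r, 3) for r in rates])
print("gap between highest and lowest group:", round(gap, 3))
```

Run it with a few different seeds. The groups will differ from one another purely by chance, and the more groups you slice the data into, the larger the gap between the “best” and “worst” tends to be.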
The paper did mention several caveats and limitations, for which I applaud them. Here are some:
The present study has limitations associated with potential measurement error, survey response rates, and small sample size for subgroup analyses. First, although a clinical diagnosis of hypertension requires documentation of elevated pressures on 2 separate office visits, our measurements were collected in 1 sitting, which potentially could have caused us to overestimate disease rates…Another limitation is the 55% survey response rate. We addressed this potential selection bias through the use of survey weights that adjusted for information on age, sex, race/ethnicity, borough of residence, income, education, language spoken at home, and household size, obtained either directly from interviews or from neighborhood census data. Other potential sources of error include recall bias and measurement error…Another limitation is that the present model could not measure all aspects of access to care.
Now go back to the New York Sun article and see how many of these limitations and caveats made it into the story, and into the decision making process of Frieden.
[1] “Taylor series linearization” sound fancy? It isn’t. Anybody who has ever had calculus will recall that this procedure “cuts off all higher order terms” to make an approximation that is easier to compute than the original problem, meaning (surprise!) that the results will be more certain than they should be.
Given that everybody will die, and that all effects have a cause, everybody will die of some cause. In any period of time you consider in which there are at least some deaths, those deaths will therefore have various causes. One (or more) of these will outnumber the other causes. This cause (or group) will be called “The Leading Cause.”
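The argument can be spelled out in a few lines of Python. The causes and counts below are invented, and both function names are hypothetical; the only point is that removing the current leader just promotes something else.

```python
from collections import Counter


def leading_cause(deaths):
    """Return the cause (or causes, if tied) with the most deaths."""
    counts = Counter(deaths)
    if not counts:
        return []  # no deaths in the period, so no leading cause
    top = max(counts.values())
    return sorted(cause for cause, k in counts.items() if k == top)


def eliminate(deaths, cause, replacement):
    """Everybody still dies: those spared one cause die of another."""
    return [replacement if d == cause else d for d in deaths]


# Made-up tallies for one period.
deaths = ["heart"] * 5 + ["cancer"] * 4 + ["accident"] * 2

before = leading_cause(deaths)
after = leading_cause(eliminate(deaths, "heart", "stroke"))

print("leading cause before:", before)
print("leading cause after eliminating it:", after)
```

However the spared deaths are reassigned, the list is never empty as long as somebody dies, so there is always a leader.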
It is not a question of logic whether seeking to change the current leading cause to another one is good.