Epidemiology, Causality, And P-Values: Part I

A standard epidemiological study goes like this: people who have been “exposed” to some thing, say, cell phone radiation, are examined to discover whether or not they have some malady. Some will, some won’t.

A second group of folks who have not been “exposed” to cell phone radiation are also examined to discover whether they have the same malady. Again, some will, some won’t.

In the end, we have two proportions (fractions, or percentages) of sufferers: in the exposed and non-exposed groups. If the proportion of those who have the malady in the exposed group is larger than the proportion of those who have the malady in the non-exposed group, then the exposure is claimed, at least tacitly by scientists, and absolutely by all lawyers, to have caused the malady.

Actually, the exposure will be said to have caused the difference in proportions of the malady. But the idea is clear: if it weren’t for the exposure, the rate of sufferers in the exposed group would be “the same” as in the non-exposed group.

The statisticians or epidemiologists who publish these results, and the press-releasing officials at the institutions where the epidemiologists work, are careful, however, to never use the word “cause”.

They instead point to the exposure and say, “What else could have caused the results?” thus leaving the reader to answer the rhetorical question.

This allows wiggle room to doubt the exposure truly caused the malady, in much the same way as television newsreaders use the word “allegedly.” As in, “Smith was allegedly caught beating his victim over the head with a stick.” If upon other evidence, Smith turns out to be not guilty, reporters can truthfully, and oh-so-innocently, claim, “We never said he did it.”

The humble p-value is the epidemiologist’s accomplice in casting aspersions on the exposure as the guilty (causative) party. Here’s how.

Classical statistics allows you to pose a hypothesis, “The proportions of sufferers in the two groups, exposed and not-exposed, are identical.” The data for this hypothesis—here, just the numerators and denominators of the two proportions—are fed into software and out pops a p-value.

If that p-value is less than the mystical value of 0.05 (this value has always been arbitrary, but you simply cannot talk people out of its metaphysical significance), then you are allowed to “reject” the hypothesis of proportional equality.

Meaning, in the tangled language common to statistics, you conclude that the proportions are different. This is always a fact which you already knew (either the proportions were different or they weren’t; to check, just use your eyes).

But the actual interpretation when a small p-value is found, irresistibly cherished by all even though it is contrary to theory, is that the exposure did the deed. That is, that the exposure caused the differences in proportions.
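To make the "fed into software and out pops a p-value" step concrete, here is a minimal sketch of one common way to do it: a pooled two-proportion z-test using the normal approximation. The choice of test, the function name, and the example counts are all assumptions made for illustration; the software an epidemiologist actually uses may run a different test (a chi-square or Fisher's exact test, say).

```python
import math

def two_proportion_p_value(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Two-sided p-value for the hypothesis that both groups share one proportion.

    Uses the pooled two-proportion z-test (normal approximation).
    """
    p_exposed = cases_exposed / n_exposed
    p_unexposed = cases_unexposed / n_unexposed
    # Under the hypothesis of equality, estimate one common proportion.
    pooled = (cases_exposed + cases_unexposed) / (n_exposed + n_unexposed)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_exposed + 1 / n_unexposed))
    if std_err == 0:
        return 1.0  # nobody (or everybody) is sick in both groups: nothing to test
    z = (p_exposed - p_unexposed) / std_err
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical counts, invented for illustration only:
# 30 sufferers among 200 exposed, 18 among 200 non-exposed.
p = two_proportion_p_value(30, 200, 18, 200)
print(f"p-value = {p:.3f}")
print("reject equality" if p < 0.05 else "fail to reject equality")
```

Note what goes in: only the two numerators and two denominators. Nothing in the calculation knows or cares what the exposure is, which is why the p-value by itself cannot say anything about cause.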

Here’s where we are so far: two groups are checked for a malady; if the proportions of sufferers are different, and if the software pops out a publishable p-value (less than 0.05), we say the difference between the groups was caused by some agent.

However, a curious, and fun, fact about classical statistics is that, for any given difference in proportions, however small, the larger the sample of data you have, the smaller your p-value will be. Thus, just by collecting more data, you are all but assured of a successful study. Here’s an example.

Our two groups are cell phone users (exposed) and non-cell phone users (non-exposed), and our malady is brain cancer. I trot to the hospital’s brain cancer ward and tally the number of people who have and haven’t used a cell phone. These folks are the numerators in my sample.

I then need a good chunk of non-brain cancer sufferers. I am free to collect these anywhere, but since I’m already at the hospital, I’ll head to the maternity ward and count the number of people who have and haven’t used cell phones (including, of course, the newly born). None of these people have brain cancer. These folks added to the brain cancer sufferers form the denominators of my sample.

This will make for a fairly small sample of people, so it’s unlikely I’ll see a p-value small enough to issue a press release. But if I repeat my sample at, say, all the hospitals in New York City, then I’ll almost certainly have the very tiny p-value required for me to become a consultant to the lawyers suing cell phone manufacturers for causing brain cancer.
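The effect of repeating the sample at more and more hospitals can be sketched numerically. The snippet below holds the two observed proportions fixed and only multiplies the counts, then recomputes the p-value each time. Everything in it is an assumption made for illustration: the counts are invented, the test is a pooled two-proportion z-test, and real samples would not reproduce exactly the same proportions at every size.

```python
import math

def p_value(cases_a, n_a, cases_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (cases_a + cases_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (cases_a / n_a - cases_b / n_b) / std_err
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical one-hospital tally: 11 of 100 exposed vs 10 of 100 non-exposed
# have the malady, a difference of one percentage point.
base = (11, 100, 10, 100)

results = []
for k in (1, 10, 100, 1000):
    # Same proportions, k times as many people in every cell.
    counts = tuple(k * c for c in base)
    p = p_value(*counts)
    results.append((k, p))
    print(f"{k:>5}x the sample, {counts[1]:>6} per group: p-value = {p:.3g}")
```

With these made-up numbers the one-point difference is nowhere near "significant" at the original size, yet it drops below 0.05 somewhere between 10 and 100 times the sample, without the difference itself changing at all.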

In Part II: What Could Possibly Go Wrong?


  1. Good commentary on p-values including

    (1) The inertia associated with 0.05, which is approximately the proportion of US spaceflights that kill folks.

    (2) Larger sample leads to smaller p-value. If we never mention the difference between statistical significance and practical significance, we can spin our findings.

    (3) It’s not cause-and-effect, but a small p-value seems to entitle us to claim that the exposure is a “risk factor.”

    I’m careful to explain to my students that all a small p-value indicates is that the difference in proportions (or whatever) is unlikely to have occurred due to sampling error alone. AND THAT’S ALL IT INDICATES. How unlikely? That’s the p-value. Then I remind students that if they are exceptional, say one in a million, then there are about 300 people just like them in the US alone.

  2. Just letting you know how much I liked your post today. Very clear and understandable.

  3. Ray

    My favorite epidemiological study is the FDA VIOXX study at http://www.fda.gov/ohrms/dockets/ac/05/briefing/2005-4090B1_02_MERCK-Vioxx.pdf .

    This study resulted in much litigation with lawyers claiming that VIOXX caused heart attacks, based on the graph on page 142. It is perfectly obvious to me that what the graph really shows is the remarkable curative power of the placebo. If you take a placebo for 18 months, it will prevent heart attacks. Does Dr. Briggs have another explanation?

  4. Kevin

    The value of 0.05 is immaterial to proving causation. That is why one needs a dosage-response curve that rises monotonically, and some physical reason for the response. This serves essentially like two of Koch’s postulates–the effect is absent when the cause is absent, and present when the cause is.

  5. More information about the biological effects of non-ionizing radiation from wireless technology is coming out every day. Enough is not being done by cities, counties, states and the Federal Government to protect us from the potentially devastating health and environmental effects. Through the 1996 telecommunications act the telecoms are shielded from liability and oversight. Initially cell phones were released with no pre-market safety testing despite the fact the Government and the Military have known for over 50 years that radio frequency is harmful to all biological systems (inthesenewtimes dot com/2009/05/02/6458/.). Health studies were suppressed and the 4 trillion dollar a year industry was given what amounts to a license to kill.
    On its face, the 1996 telecommunications act is unconstitutional and a cover-up. Within the fine print, city governments are not allowed to consider “environmental” effects from cell towers. They should anyway! It is the moral and legal obligation of our government to protect our health and welfare. Or is it? When did this become an obsolete concept? A cell tower is a microwave weapon capable of causing cancer, genetic damage & other biological problems. Bees, bats, humans, plants and trees are all affected by RF & EMF. Communities fight to keep cell towers away from schools, yet they allow the school boards to install Wi-Fi in all of our schools, thereby irradiating our kids for 6-7 hours each day. Kids go home and the genetic assault continues with DECT portable phones, cell phones, Wi-Fi and Wiis. A tsunami of cancers and early Alzheimer’s awaits our kids. Young people under the age of 20 are 420% more at risk of forming brain tumors (Swedish study, Dr. Lennart Hardell) because of their soft skulls, brain size and cell turnover time. Instead of teaching “safer” cell phone use and the dangers of wireless technology, our schools mindlessly rush to wireless, bending to industry pressure rather than making informed decisions. We teach about alcohol, tobacco, drugs and safe sex but not about “safer” cell phone use. We are in a wireless trance; scientists are panicking while young brains, ovaries and sperm burn.

  6. Bernie

    david ehm:
    Links to studies that are not behind pay walls, covering both supportive and non-supportive findings, would be useful. A quick scan of reports of Dr Lennart Hardell’s research suggests that his studies fit Matt’s framework rather nicely, i.e., epidemiological.

    One newspaper reference (http://www.consumeraffairs.com/news04/2005/cell_tumors_sweden.html ) is particularly enlightening since there is a rather large difference between rural and urban residents. However, perhaps because of the reporting, the reason for this finding is spectacularly under-explained.

    Why am I skeptical? Perhaps because with the size of the effects being discovered, we should be inundated with a fairly dramatic increase in national rates of brain tumors that are not attributable to increased rates of detection.

  7. Kevin

    The report of the Swedish study that Bernie references above is five years old and was asking people to recall cell phone usage at least five years prior to the study. Indeed, if there truly were a factor of eight difference in risk for rural vs urban users, then we ought to see a veritable epidemic at this time because of the growth of cell phone usage both in terms of population and time spent on the phone.

  8. Bernie

    Exactly. First rule in looking at this type of study is to check the reasonableness of the finding: size the problem, size the effect.

  9. Kevin


    Lord Rothschild gave two rules for analyzing reports of risk.

    1) Is the measure stated in some manner that is easy to understand like 1 in 100?
    2) Is the risk stated in some manner of exposure, such as per some unit of time, and is the unit given?

    If the answer to these is “no”, then ignore the report.

    This Swedish study manages to give us a few pieces of information such as the age range of people in the study, and size of sample, and so forth. But it didn’t tell us how many people were included in each group (this must have been a 2x2x2 study), and said the “chance” was eight times higher in one group, but was this a risk ratio, or odds ratio, and what is the actual risk? So, what looks like authoritative information is actually nothing useful.

    I wonder if this lack of usefulness is the fault of the journalist, or the fault of poorly constructed research?

  10. Bernie

    IIWTWT. The design was probably ex post facto!!
