The mean can be nasty. This should come as news to no one. But since, the good Lord knows why, this topic made it into the New York Times, it has become news and so must be discussed.
One Stephanie Coontz, associated with something called the Council on Contemporary Families—and since “families” is modified by “contemporary” one immediately has the suspicion that “families” does not mean “families”, but never mind—trumpeted a report called “The Trouble with Averages” written by some guy who sees fit to end his name with “PhD.”
The sin warned against is using just one number to summarize uncertainty in a thing which can take more than one value. The one number is the numerical average, a quantity which is often asked to bear burdens far beyond its capability.
The average is often used to define what is “normal”, with the implication that deviations from it are “Abby Something”, to quote Igor. The more slavish the devotion to this concept, the more the world appears insane, because hitting the average becomes increasingly difficult.
This applies to people and things. You can say the normal temperature is X degrees, and as long as you define exactly how this was calculated, you’re on solid ground, but only an activist would fret at any departure from this number and suspect foul play.
It might be that the average man grieves (say) 8 months after the death of his wife (one of Coontz’s examples), but that doesn’t mean that a man who stops crying at 2 months is hard-hearted, nor that a man who wears sackcloth for two years is insane.
Using just the average to define “normal” in people is dangerously close to the fallacy of defining moral truths by vote. Come to think of it, isn’t that what the Diagnostic and Statistical Manual of Mental Disorders does? Plus, even “extremes” might not be “abnormal” in the sense of undesirable or harmful; it all depends on the behavior and our understanding of biology and morality.
Planning on the average for physical things can make sense, but only in the rare cases where the average is all that matters. Engineers don’t design bridges to withstand only average loads.
Unless the item of interest is fixed and unchanging, in which case the numerical mean is all that can occur, the idea of calculating an average is to assist in quantifying the uncertainty of the thing. If a thing varies, the mean will always be incomplete, and reliance on it alone will lead to overconfidence.
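To make the point concrete, here is a minimal sketch in Python with made-up numbers (nothing from Coontz’s report): two sets of hypothetical bridge loads share the same mean, yet only one of them ever exceeds a design limit. Looking at the mean alone hides that entirely.

```python
# A minimal sketch (invented numbers): two sets of loads share the same mean,
# but the mean alone says nothing about how often a design limit is exceeded.
steady_loads = [98, 99, 100, 100, 101, 102]      # tons; low variability
variable_loads = [60, 75, 100, 100, 125, 140]    # tons; same mean, wide spread

design_limit = 120  # tons; hypothetical capacity chosen only for illustration

for name, loads in [("steady", steady_loads), ("variable", variable_loads)]:
    mean = sum(loads) / len(loads)
    frac_over = sum(1 for x in loads if x > design_limit) / len(loads)
    print(f"{name}: mean = {mean:.0f} tons, "
          f"fraction over {design_limit} tons = {frac_over:.2f}")

# Both series print a mean of 100 tons, yet only the variable one ever
# exceeds the limit. Planning on the average alone would miss that risk.
```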
And don’t forget: probability doesn’t exist as a physical thing; it is instead the measure of uncertainty.
Anyway, not much of a post or a lesson today. Instead I’ll put the burden on you. What are some good examples where the mean, and only the mean, is an adequate summary?
Update: Coontz used the word “outliers”. There are no such things. There can be mismeasured data, i.e. incorrect data, say when you tried to measure air temperature but your thermometer fell into boiling water. Or there can be errors in recording the data: transpositions and the like. But excluding mistakes, when the numbers you recorded are the numbers you meant to measure, there are no outliers. There are only measurements which do not accord with your theory about the thing of interest.
Far too often I find people throwing out real data because it doesn’t fit their preconceptions, i.e. model. Nutty behavior.
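A small illustration of that nutty behavior, with an assumed setup (mine, not Coontz’s): generate data from a genuinely skewed process, then screen it with the common “more than 3 standard deviations from the mean” rule. Every flagged point is a real measurement; it only looks anomalous because the bell-curve model behind the rule does not fit the data.

```python
# Sketch: real (simulated) skewed data screened with a 3-standard-deviation
# "outlier" rule. The flagged points are legitimate measurements; the rule's
# bell-curve assumption, not the data, is what is wrong.
import random
import statistics

random.seed(1)
data = [random.lognormvariate(0, 1) for _ in range(1000)]  # skewed, real values

mean = statistics.mean(data)
sd = statistics.stdev(data)
flagged = [x for x in data if abs(x - mean) > 3 * sd]

print(f"{len(flagged)} points flagged as 'outliers' out of {len(data)}")
# Throwing these away does not clean the data; it discards the very
# observations showing that the assumed model is mistaken.
```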
Thanks to Andrew Kennett for pointing us to this topic.