The vale of tears
I’ve been asked by several people to comment on Pasquale Cirillo and Nassim Nicholas Taleb’s paper (thankfully, not peer-reviewed, unless you count this) “On the tail risk of violent conflict and its underestimation”. That paper was written partly in response to Steven Pinker’s contention that people (us, we) have been growing less violent thanks (mainly) to progressivism and democracy (a redundancy).
We met Pinker’s book before (link) so I won’t spend any time with it, except to say that I didn’t buy his argument.
So then. What is violence? Used to be, in days of yore, people fought constantly (men, anyway, or mainly men) and with fewer deaths than our sophisticated weaponry today provides. For proof, read any ethnology, or for a decent summary Constant Battles: Why We Fight by Steve LeBlanc. Men have always conked each other upside the head. LeBlanc gives too much weight to environmental causes for wars, when status and sex are as influential.
But that’s neither here nor anywhere: fact is, men are violent. Wars for honor are still fought, though in the West we don’t call them “wars”—but we do call them battles. Mexican drug gangs are slicing and dicing each other. Citizens here are often unhappy. ISIS is chewing up the Mideast. Truly progressive governments are discovering they could do without certain citizens (for just one example). And this unpleasantness is only a small sample in a small piece of time.
Body counts are hard to come by. Do we include old-fashioned crime? Or only bloodlettings from officially declared wars? Seems to me any natural reading of violence would include those acts which purposely end somebody’s life. That would include abortion—the violent killing of a life—euthanasia and executions. Yet those killings make some squeamish, so as a favor to delicate readers I’ll skip over them.
Don’t skip by too lightly, though. If we’re going to quantify violence—which Pinker, Cirillo and Taleb all do—we need to have a rigorous definition of what counts. Another point: including only deaths is difficult. For instance, emergency medicine (battlefield and civilian) has improved dramatically these past decades. Many who would have died now live. Point is: much depends on what we’re quantifying.
Enter Cirillo and Taleb
Cirillo and Taleb looked for historian-defined “wars” and “conflicts” and not what we today call crime. These wars were classed as “events”, except that any war lasting more than 25 years was cut up into multiple “events.” This makes the data more amenable to their model, but at the cost of changing reality. There are difficulties in counting the dead in named wars. Why not a year-by-year tally of the violently killed, regardless of under what flag? Focusing on concrete historian-generated boundaries makes for better stories, but it hinders counting. And there is other fuzziness:
Further, in the absence of a clearly defined protocol in historical studies, it has been hard to disentangle direct death from wars and those from less direct effects on populations (say blocades [sic], famine). For instance the First Jewish War has confused historians as an estimated 30K death came from the war, and a considerably higher (between 350K and the number 1M according to Josephus) from the famine or civilian casualties.
Excellent points. Famine does not kill as many (I did not say none) today after a war because food production and distribution are more robust. The authors also say, “We can assume that there are numerous wars that are not part of our sample, even if we doubt that such events are in the ‘tails’ of the distribution, given that large conflicts are more likely to be reported by historians.” More likely does not imply certainly. And they say, “events are more likely to be recorded in modern times than in the past.”
Measurement error, as the authors admit and as is by now obvious, is a tremendous and incurable problem. What that means is that any formal model of violence is going to be too sure of itself and in a way that we can’t quantify. This is not surprising: not all uncertainty is quantifiable.
Lastly, the authors “rescaled” the event death tolls by world population estimates (more unaccounted-for uncertainty). This makes some sense, but it has its limits. When Cain whacked Abel, he reduced the world’s population by a quarter, but today when Boko Haram rapes a woman to death the effect on the tax rolls is minuscule. The figure above is the “Rescaled death toll of armed conflict and regimes over time” from their final dataset.
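To see how much the rescaling stretches small-population events, here is a minimal sketch. It assumes the simplest possible rule, a death toll taken as a share of the world population at the time and re-expressed in a reference population. The paper's exact procedure is not reproduced here, and the function name and the reference population of 7.2 billion are my own illustrative choices.

```python
def rescale_death_toll(deaths, world_pop_then, reference_pop):
    """Express a death toll as the equivalent toll in a reference population.

    Assumes plain proportional rescaling; this is an illustration,
    not the authors' code.
    """
    if world_pop_then <= 0:
        raise ValueError("world population must be positive")
    return deaths / world_pop_then * reference_pop


# Assumed reference population, chosen only for illustration.
REFERENCE_POP = 7_200_000_000

# Cain and Abel: one death in a world of four people.
cain_abel = rescale_death_toll(1, 4, REFERENCE_POP)

# One killing today, measured against today's population.
one_today = rescale_death_toll(1, REFERENCE_POP, REFERENCE_POP)

print(f"Cain and Abel, rescaled: {cain_abel:,.0f}")
print(f"One killing today, rescaled: {one_today:,.0f}")
```

Under this assumed rule one murder in a world of four becomes the equivalent of 1.8 billion deaths, while one murder today stays at one. Any error in the population estimate passes straight through to the rescaled number.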
An Lushan? Led a revolt against the Tang dynasty and was so nasty his own son had to take him out. Why isn’t Zhang Xianzhong on the list? His motto was “Kill. Kill. Kill. Kill. Kill. Kill. Kill.” Reports are that he enthusiastically implemented it. And where are Mao and Stalin? They caused the deaths of tens of millions. Here’s a problem: it isn’t “war” or “conflict” if you kill your own subjects. More uncertainty.
Are killings unusual?
Anyway, let’s assume the picture is in the ballpark (the rescaling technique suffers in smaller populations, as with Cain and Abel) but that it represents an error-prone way of counting violence (older events are more likely to have been missed). Then, adding in our knowledge of recent history, ask yourself: is violence going down?
Yes and no. We haven’t, as observed, had a large scale (International-Socialism, WW2, etc.) kill-off this last half century, but that in no way implies we won’t have one in the next half century, or indeed at any time. We lasted some 50 years with only (only!) small-scale slaughter, if you except abortion. But there have been plenty of similarly long periods “in-between” mass killings in history. Armageddons don’t happen yearly; they are sparse, but not unusual.
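Checking for quiet stretches takes nothing more than counting. The sketch below lists every gap of at least 50 years between consecutive events. The onset years are made up for illustration and are not taken from Cirillo and Taleb's dataset, and the function name is hypothetical.

```python
def quiet_stretches(event_years, min_gap):
    """Return (start, end) pairs of consecutive events at least min_gap years apart."""
    years = sorted(event_years)
    return [(a, b) for a, b in zip(years, years[1:]) if b - a >= min_gap]


# Made-up onset years of large kill-offs, for illustration only.
hypothetical_years = [1200, 1260, 1275, 1340, 1400, 1410, 1480]

stretches = quiet_stretches(hypothetical_years, min_gap=50)
for start, end in stretches:
    print(f"{start} to {end}: {end - start} quiet years")
```

This is a tally, not a model: it says what happened in the list it is given and nothing about what comes next.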
And we’re done. We don’t need a model or any other form of quantification to tell us the obvious. We never need a model to tell us what we’ve seen, unless we’re using that model to tell us what happened in the presence of measurement error, as exists here. But that’s not the use to which Cirillo and Taleb put their model. They use it to tell what happened. Or, rather, not what happened, but what happened to their model.
Now in their favor, everybody does this. Nobody is content to let the data speak for itself. People will build models and say, “Here’s what happened” when what they really mean is, “Here is a replacement of reality that pleases me and has these mathematical properties.”
The only reason to build a model of violence—and it’s a darn good reason—is to predict how many dead bodies we expect to create in the future (so we know where not to be). But given all the unquantifiable uncertainties mentioned, I would have very little confidence in that model.
Causes of war
Before I discuss (briefly) Cirillo and Taleb’s model, understand this: something caused each war. There are surely similarities in causes across wars (“I want to kill,” says somebody in each), but there are always causes for each death. There is thus no “mean rate” of violence that is “natural”, a rate which propels men toward slaughter. (This summary of the paper makes this error repeatedly.) There is no such thing as a “background violence rate”: there are only causes which we may or may not fully understand. If we understood them, we’d never need statistical models.
There are mean numbers of dead bodies, of course, measured over whatever arbitrary time points we pick, but that kind of summary gains us nothing over plots like that above. That picture—how it was created, the vagaries of the data included and unseen, and the like—is the entire analysis. Cirillo and Taleb are to be congratulated for the hard work in collecting this data (they credit one Captain Mark Weisenborn). Yet putting math to the data can only produce over-certainty unless we use the math to predict what will happen. But then we have to wait and see if the model made the correct predictions (put high probability on what happened and low on what didn’t).
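What "put high probability on what happened and low on what didn't" means in practice can be sketched with two standard scoring rules, the Brier score and the log score. Everything below is hypothetical: the outcomes and both sets of forecast probabilities are invented for illustration, and neither "model" is Cirillo and Taleb's.

```python
import math


def brier_score(probs, outcomes):
    """Mean squared gap between forecast probability and outcome (0 is perfect)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def log_score(probs, outcomes):
    """Mean negative log probability given to what actually happened (lower is better).

    Assumes every probability is strictly between 0 and 1.
    """
    total = 0.0
    for p, o in zip(probs, outcomes):
        p_happened = p if o == 1 else 1.0 - p
        total += -math.log(p_happened)
    return total / len(probs)


# Hypothetical record: 1 = a large war began in that period, 0 = it did not.
outcomes = [0, 0, 1, 0, 1]
model_a = [0.1, 0.2, 0.7, 0.1, 0.6]  # commits itself and is mostly right
model_b = [0.5, 0.5, 0.5, 0.5, 0.5]  # always shrugs

print("Brier:", brier_score(model_a, outcomes), brier_score(model_b, outcomes))
print("Log:  ", log_score(model_a, outcomes), log_score(model_b, outcomes))
```

In this invented record model_a earns the lower (better) score on both rules. The point is that the scores can only be computed after the outcomes are in, which is the waiting the paragraph above describes.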
Obviously, we haven’t waited and so can’t say whether the model Cirillo and Taleb posit is any good. The pair present some measures of model fit, and these are of modest interest, but they are far (as in far) from proof of the model’s goodness. Don’t forget climate and sociological models also show good fit but poor predictive skill.
We must avoid the Deadly Sin of Reification. This is the false belief that, somehow, mathematics is superior to reality, that the model is good because it makes reality cleaner. Cirillo and Taleb talk about the “stochasticity of under inter-arrival times” as if wars arrive or are guided by some mathematical process. This is the sin. Wars are caused by men. Our understanding of the uncertainty of when wars start might usefully be encapsulated by a model, but that’s as far as we can go (and we haven’t yet demonstrated that that is true for Cirillo and Taleb’s model).
Wars and rumors of wars
One thing that is obvious in the plot, and from any serious reading of (non-Howard Zinn-like) history, is that large kill-offs are not especially rare. Cirillo and Taleb’s model agrees (as it must). Why are they not rare? This is key. We know wars are caused by humans, and if we accept the premise that human nature is flawed, it is rational to conclude more wars will occur and that some will cause the creation of copious coffins.
Cirillo and Taleb appear (they don’t explicitly mention it) to accept this premise. Pinker does not. That premise is the real and substantial difference between the two predictions of future wars. If you believe people are perfectible through education and enlightenment, then it follows wars will decrease in number and intensity. But if you believe men will always disappoint, and given the data in plots like Cirillo and Taleb’s, then it’s only a matter of time until the next full-scale war hits.
Good thing about both of these models is that they are testable. We just have to wait and see which is true.
Update: I saw in other discussions of this paper (and of my discussion) words about how times between wars are or aren’t “random.” These are all wrong-headed. Just as something caused each war, something caused each peace. Random only means unknown. Data are not random: it is only that our knowledge of their causes is incomplete. NO MODELS ARE NEEDED HERE.