Chilcot on Civilian Casualties: Part 4

In October of 2004 The Lancet published a paper by Roberts et al. that estimated the number of excess deaths for the first year and a half of the Iraq war using data from a new survey they had just conducted.  (Readers wanting a refresher course on the concept of excess deaths  can go here.)

One of the best parts of the civilian casualties chapter of the Chilcot report is the front-row seat it provides for the (rather panicked) discussion that Roberts et al. provoked within the UK government.  Here the real gold takes the form of links to three separate reviews of the paper provided by government experts.  The experts are Sir Roy Anderson of the first report, Creon Butler of the second report and Bill Kirkup, CBE of the third report.

In the next several posts I will evaluate the evaluators.  I start by relying mostly on information that was available when they wrote their reports.  But I will, increasingly, take advantage of hindsight.

For orientation I quote the “Interpretation” part of the Summary of Roberts et al.:

Making conservative assumptions, we think that about 100,000 excess deaths, or more have happened since the 2003 invasion of Iraq.  Violence accounted for most of the excess deaths and airstrikes from coalition forces accounted for most violent deaths.  We have shown that collection of public-health information is possible even during periods of extreme violence.  Our results need further verification and should lead to changes to reduce non-combatant deaths from air strikes.

The UK government reaction focused exclusively, so far as I can tell, on the question of how to respond to the PR disaster ensuing from:

  1.  The headline figure of 100,000 deaths, which was much bigger than any that had been seriously put forward before.
  2. The claim that the Coalition was directly responsible for most of the violence.  (Of course, one could argue that the Coalition was ultimately responsible for all violence since it initiated the war in the first place but nobody in the government took such a position.)

Today I finish with two important points that none of the three experts noticed.

First, the field work for the survey could not have been conducted as claimed in the paper.  The authors write that two teams conducted all the interviews between September 8 and September 20, i.e., in just 13 days.  There were 33 clusters, each containing 30 households, so 990 interviews in all.  This means that each team had to average nearly 40 interviews per day, often spread across more than a single sampling point (cluster).  These interviews had to be done on top of travelling all over the country, on poor roads with security checkpoints, to reach the 33 clusters in the first place.
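The arithmetic behind that "nearly 40" figure is easy to check.  What follows is only a sketch using the numbers quoted from the paper; the variable names are mine, and it assumes the work was split evenly between the two teams with every one of the 13 days available for interviewing.

```python
# Field-work workload implied by the figures reported in Roberts et al.
# Assumption: both teams work all 13 days and share the load evenly.
clusters = 33
households_per_cluster = 30
teams = 2
field_days = 13  # September 8 through September 20, inclusive

total_interviews = clusters * households_per_cluster
team_days = teams * field_days

interviews_per_team_per_day = total_interviews / team_days
clusters_per_team_per_day = clusters / team_days

print(f"Total interviews:            {total_interviews}")
print(f"Interviews per team per day: {interviews_per_team_per_day:.1f}")
print(f"Clusters per team per day:   {clusters_per_team_per_day:.2f}")
```

Any day lost to travel or security problems only pushes the required daily pace higher.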

To get a feel for the logistical challenge that faced the field teams consider this picture of the sample from a later, and much larger, survey – the Iraq Living Conditions Survey:

ILCS Sample

I know the resolution isn’t spectacular on the picture but I still hope that you can make out the blue dots.  There are around 2,200 of them, one for each cluster of interviews in this survey.

Now imagine choosing 33 of these dots at random and trying to reach all of them with two teams in 13 days.  Further imagine conducting 30 highly sensitive interviews (about deaths of family members) each time you make it to one of the blue points.  If a grieving parent asks you to stay for tea do you tell them to just answer your questions because you need to move on instantly?

The best-case scenario is that the field teams cut corners with the cluster selection to render the logistics possible and then raced through the interviews at break-neck speed (no more than 10 minutes per interview).  In other words, the hope is that the teams succeeded in taking bad measurements of a non-random sample (which the authors then treat as random).  But, as Andrew Gelman reminds us, accurate measurement is hugely important.

The worst-case scenario is that the field teams simplified their logistical challenges by making up their data.  Recall that data fabrication is widespread in surveys done in poor countries.  Note, also, that the results of the study were meant to be released before the November 2 election in the US and the field work was completed only on September 20, so slowing down the field work to improve quality was not an option.

Second, no expert picked up on the enormous gap between the information on death certificates reported in the Roberts et al. paper and the mortality information the Iraqi Ministry of Health (MoH) was releasing at the time.  A crude back-of-the-envelope calculation reveals the immense size of this inconsistency:

  1.  The population of Iraq was, very roughly, 24 million and the number of people in the sample is reported as 7,868.  So each in-sample death translates into about 3,000 estimated deaths (24,000,000/7,868).  Thus, the 73 in-sample violent deaths become an estimate of well over 200,000 violent deaths.
  2. Iraq’s MoH reported 3,858 violent deaths between April 5, 2004 and October 5, 2004, in other words a bit fewer than 4,000 deaths backed by MoH death certificates.  The MoH has no statistics prior to April 5, 2004 because their systems were in disarray before then (p. 191 of the Chilcot chapter).
  3. Points 1 and 2 together imply that death certificates for violent deaths should have been present only about 2% of the time (4,000/200,000).
  4. Yet Roberts et al. report that their field teams tried to confirm 78 of their recorded deaths by asking respondents to produce death certificates and that 63 of these attempts (81%) were successful.
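The back-of-the-envelope calculation above can be reproduced in a short script.  It is a rough sketch, not a proper estimate: it uses the very rough 24 million population figure, ignores the survey's design and confidence intervals, and the variable names are mine.

```python
# Crude scale-up of in-sample violent deaths to a national figure,
# compared against the MoH count of certified violent deaths.
# Assumption: simple proportional scaling, as in the list above.
population = 24_000_000          # very rough population of Iraq
sample_size = 7_868              # people in the Roberts et al. sample
violent_deaths_in_sample = 73
moh_violent_deaths = 3_858       # MoH, April 5 to October 5, 2004

deaths_per_sample_death = population / sample_size
estimated_violent_deaths = violent_deaths_in_sample * deaths_per_sample_death

# Share of estimated violent deaths that could have an MoH certificate.
implied_certificate_rate = moh_violent_deaths / estimated_violent_deaths

# Share of attempted confirmations that the field teams reported as successful.
reported_certificate_rate = 63 / 78

print(f"Scale-up factor:           {deaths_per_sample_death:,.0f}")
print(f"Estimated violent deaths:  {estimated_violent_deaths:,.0f}")
print(f"Implied certificate rate:  {implied_certificate_rate:.1%}")
print(f"Reported certificate rate: {reported_certificate_rate:.1%}")
```

Using the exact figures rather than the rounded 200,000 and 4,000 leaves the implied rate at roughly 2%, so the rounding is not what drives the gap.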

The paper makes clear that the selection of the 78 cases wasn’t random and it could be that death certificate coverage is better for non-violent deaths than it is for violent deaths.


There is a big, yawning, large, humongous, massive gap between 2% and 81% and something has to give.


Here are the only possible resolutions I can think of:

  1.  The MoH issued vastly more (i.e., 50 times more) death certificates for violent deaths than it has admitted to issuing.  This seems far-fetched in the extreme.
  2. The field teams for Roberts et al. fabricated their death certificate confirmation figures.  This seems likely, especially since the paper reports:

Interviewers were initially reluctant to ask to see death certificates because this might have implied they did not believe the respondents, perhaps triggering violence.  Thus, a compromise was reached for which interviewers would attempt to confirm at least two deaths per cluster.

Compromises that pressure interviewers to risk their lives are not promising and can easily lead to data fabrication.

  3. The survey picked up too many violent deaths.  I think this is true and we will return to this possibility in a follow-up post, but I don’t think that this can be the main explanation for the death certificate gap.

OK, that’s enough for today.

In the next post I’ll discuss more what the expert reports actually said rather than what they didn’t say.

8 thoughts on “Chilcot on Civilian Casualties: Part 4”

  1. The 2004 Lancet survey (L1) isn’t really worth even paying attention to anymore in my opinion.

    It has to be one of the most wildly over-hyped scientific papers in human history. The bottom line is that it didn’t really prove anything. But the authors were interested in grabbing as many headlines as possible and creating a political firestorm, so they drew and promoted exaggerated conclusions from the study that aren’t properly supported by it.

    On the total deaths number (referring to both direct violent deaths and indirect deaths), the estimate was 98,000 with an error margin of 8,000-194,000. The central number here was higher than any other at the time, but all other numbers at the time were referring only to violent deaths and were still well within those margins (i.e., they were well above 8,000), meaning the study didn’t actually prove a single thing at the standard significance level that wasn’t already proven elsewhere. At best, it suggested the number of deaths might have been considerably higher than other sources had reported, and also that the war might have led to a substantial number of indirect deaths in addition to just the violent deaths being reported by most other sources at the time. That could then form the basis for a call for larger and better-resourced studies to be conducted. If the authors had stopped there, there would have been little to take issue with.

    Instead, the authors chose to claim that it proved 100,000 was somehow a minimum number of excess deaths and that the US-led Coalition was directly responsible for 84% of all violent deaths. Both of those claims were unsupportable and not close to being substantiated at any reasonable confidence level by the study or the peer-review. Those claims were political propaganda, not science, and their purpose was just to grab as many headlines as possible and create a political firestorm. That in turn gave critics and governments valid reasons for doubting, criticizing and rejecting the study, whereas if the authors had stuck to valid interpretations, there would have been little to argue with (but also very little of substance to claim from it in the first place).

    L1 is also just downright confusing because of the constant shell games that went on with the Falluja outlier cluster. It’s clear in hindsight that Roberts (L1 lead author) wanted the study to include this cluster in all its estimates, which would have meant an estimate of ~285,000, rather than the 100,000 that was published, and would have meant 84% of the violent deaths were by the Coalition. That is what Roberts wanted L1 to say, but the Lancet rejected the Falluja outlier as too unreliable and forced Roberts to exclude it. The problem is Roberts never gave up on that interpretation of the survey and constantly smuggled it all back into the narrative outside the peer-review process, by way of subsequent press releases and comments and statements to media. The two interpretations produce quite different numbers, so it then became a confusing puzzle to determine which claims were using which interpretation of the survey and which were supported or not supported by the published article.

    For example, few probably realize that this claim of 84% by Coalition, which the authors promoted heavily in the media and which Roberts would later (absurdly) assert to be “most disturbing and certain about the results”, doesn’t even appear anywhere in the peer-reviewed article. That’s no surprise because reaching that percentage requires including the Falluja outlier and treating it as representative, something which the Lancet review rejected. So what Roberts did was just smuggle it back into the mix by way of a (non-reviewed) press release for the survey and various comments and publications in the media.

    That “finding” was headline grabbing and provocative, but it was also bogus. If you exclude the Falluja cluster, as the peer-reviewed report did, the percentage for violent deaths by Coalition should be 43%, which is very similar to what IBC and all subsequent studies (including even the 2006 Lancet survey) have found on the issue for that early period. The dubious 84% claim that Roberts was pushing on the back of L1 was just flat out false and never had any merit. What it did have was the power to grab headlines and manipulate the political narrative.

    Roberts also used the outlier to convert the 100,000 estimate from a very unreliable central figure within an enormous range of possible figures, lower or higher, into a minimum number, essentially erasing the entire lower half of the published CI with interpretive sleight of hand. While the central figures for L1 would indeed go a lot higher if you include the outlier cluster, the confidence intervals grow wider too. So you get a much higher excess deaths number like 285,000 instead of 100,000 but you would also get an even wider interval stretching down even lower than 8,000. This would then have meant the study couldn’t even rule out negative excess deaths (that the war “saved lives”) at standard confidence levels, let alone support a claim that 100,000 was the lowest possible number.

    By including the outlier you would also get a much higher Coalition percentage for violent deaths (84% rather than 43%), but again the margin on this would be so enormous that you still couldn’t actually rule out 43% or even lower figures than that.

    At the end of the day, L1 is a meaningless contribution because the error margins are enormous and it was arbitrarily interpreted in two different ways whenever the authors felt like it. This means the study covers every conceivable answer one way or another. The number of excess deaths can be 285,000 or it can be 100,000, or it can be lower or higher than either. The Coalition percentage can be 84% or it can be 43%, or it can be lower or higher than either. At least one of the two interpretations is consistent with anything and everything, all the time. Really, both of them are consistent with everything all the time because the margins are so enormous.

    When something is consistent with anything, it means nothing.

    And this is all assuming that the L1 data was all actually collected in a rigorous scientific manner, something which, as you note, is pretty doubtful to begin with. I suspect that it started out trying to be scientific and then started cutting corners along the way and then ultimately, by the Fallujah cluster, became just a free-for-all of biased sampling and numbers padding.

    L1’s only “value” was in the realm of political propaganda, advancing a particular partisan narrative of the conflict before a crucial election and beyond, which was clearly the authors’ main interest all along. They were doing political activism, headline chasing and tendentious speculating, but carefully dressed up to resemble science (something which Roberts is actually pretty expert at doing, given he’s largely built his career on just that).

    Also relevant here is IBC’s point about the Chilcot report, saying, “Its close questioning of the UK Government’s behind-closed-doors thinking on Iraqi civilian casualties reveals how narrowly the Government regarded this question: not as something which victims of armed violence and their families might reasonably expect an answer to, but as a political tool in the ‘blame game’ of the war.”

    The UK government was not alone in approaching the issue exactly that way. Some just did it from the opposite direction.

    At the end of the day L1 didn’t really prove or mean much of anything. It was just over-hyped and over-interpreted beyond reason to make political points. In any case, I hear the raw data for L1 is now somehow “lost”, so while there wasn’t much to learn from it in the first place, there clearly won’t be anything further to learn from it either.


  2. “Points 1 and 2 together imply that death certificates for violent deaths should have been present only about 2% of the time (4,000/200,000).”

    Actually Mike, I don’t think the MoH would have been the only party issuing death certificates during this period. The Baghdad morgue and other provincial morgues were operating and I believe would have been issuing death certs. See, e.g.:

    The Lancet guys have also claimed various things like that private doctors, apparently unaffiliated with any government institutions, would also issue “death certificates”, but this is dubious and also raises the question of what exactly is the definition of a “death certificate” in the context of these surveys. It sort of sounds like a “death certificate” in the Lancet survey can be something like:

    This guy died.


    Epstein’s Mother

    In any case, the actual number of death certificates circulated would probably be a good bit higher than the 4,000 you claim, but still miles away from this 200,000 implied by treating the Falluja cluster as if it was representative. That of course suggests the Falluja cluster was not representative and the Lancet review was correct to reject it, but then that’s pretty obvious at this point for a lot of other reasons too.


  3. This is a belated thank you to Josh Dougherty of IBC for his comments. I find little to disagree with so I’ll just make a couple of quick comments.

    The Roberts et al. paper says that the coalition caused 62 of the 73 violent deaths recorded by the survey (including in the dubious Fallujah cluster). There were 52 violent deaths in the Fallujah cluster. If all of these 52 were attributed to the coalition then 10 of the remaining 21 deaths (outside Fallujah) would have been attributed to the coalition, i.e., fewer than half. Thus, the percent of violent deaths attributed to the coalition can drop from around 85% down below a half depending on how these deaths are distributed.

    However, so far as I’m aware, the original paper does not give the relevant breakdown between inside and outside Fallujah. So, based on the information supplied, anything between 10 and 21 coalition-attributed deaths is possible outside Fallujah.

    It’s possible that one of the authors may have supplied the relevant information post-publication. If so, I’d love to see the link. But for now I’m filing this away as another example where it would have been easy and helpful for the authors to supply relevant information but they didn’t bother to do this. (For more on this issue see the next post in this series and the report of Bill Kirkup.)

    Your second comment on death certificates is potentially important. I take the point that there could be a lot more death certificates out there than I allow for in my post. It would be great to hear on this issue from someone with on-the-ground experience during this period.

    However, let me add just one further point on the links that Josh supplied. Roberts et al. check for death certificates within households. If an unidentified body arrives in a morgue it will not be possible to issue death certificates to households unless the body, and subsequently the household, can be identified. Undoubtedly, the morgue will have paperwork on each body received but without identification the originating households can’t have a death certificate.


  4. Update.

    In my last comment I wrote:

    “However, so far as I’m aware, the original paper does not give the relevant breakdown between inside and outside Fallujah. So, based on the information supplied, anything between 10 and 21 coalition-attributed deaths is possible outside Fallujah.

    It’s possible that one of the authors may have supplied the relevant information post-publication. If so, I’d love to see the link. But for now I’m filing this away as another example where it would have been easy and helpful for the authors to supply relevant information but they didn’t bother to do this. (For more on this issue see the next post in this series and the report of Bill Kirkup.)”

    Hamit Dardagan of Iraq Body Count just supplied exactly such correspondence.

    According to a letter from Les Roberts all 12 violent deaths not attributed to the coalition occurred outside Fallujah. This means that 9 out of 21 violent deaths outside Fallujah (43%) were attributed to the coalition.

    Going by Josh’s comments it looks like he already knew that. And the speculation on this question made by Bill Kirkup in his report to the UK government was correct. The claim that the coalition caused the lion’s share of the violent deaths is dependent on fully trusting the dubious Fallujah cluster.
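For anyone who wants to check the arithmetic in this thread, here is a minimal sketch.  It takes the counts quoted in the comments above at face value (73 violent deaths, 52 of them in Fallujah, 12 not attributed to the coalition and all of those outside Fallujah); the variable names are illustrative.

```python
# Coalition share of violent deaths, with and without the Fallujah cluster.
# Assumption: the counts are exactly as quoted in the comment thread.
total_violent = 73
fallujah_violent = 52
non_coalition = 12  # per the Roberts letter, all outside Fallujah

outside_violent = total_violent - fallujah_violent
coalition_outside = outside_violent - non_coalition
coalition_total = total_violent - non_coalition

share_excluding_fallujah = coalition_outside / outside_violent
share_including_fallujah = coalition_total / total_violent

print(f"Violent deaths outside Fallujah: {outside_violent}")
print(f"Coalition-attributed, outside:   {coalition_outside}")
print(f"Share excluding Fallujah:        {share_excluding_fallujah:.0%}")
print(f"Share including Fallujah:        {share_including_fallujah:.0%}")
```

The two shares correspond to the 43% and 84% figures discussed in the first comment.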


