In part I of this series I discussed how a Columbia Journalism Review (CJR) article by Pierre Bienaimé on the number of war deaths in Iraq treats extremely noisy estimates as if they are very precise numbers.
In this post I’ll look at the article’s treatment of methodology.
Recognize that the entire article is about numbers. And when we consume numbers it is vital that we don’t just let them slide over us like the water in a morning shower. Rather, we need to think carefully about how our numbers are manufactured, i.e., think hard about the methodologies of number production.
The CJR article offers virtually nothing on the methodologies behind the numbers it discusses and most of the threadbare material it does provide is plain wrong.
What does the CJR article say about the Iraq Body Count (IBC) methodology? IBC and its collaborators (among whom I count myself the proudest) have published a string of articles that include detailed descriptions of IBC methodology, so a lazy reporter can’t go terribly wrong just quoting from one of these papers. For instance, Bienaimé could simply have dumped this one into his article:
The IBC database was prospectively developed by the authors HD and JAS when an invasion of Iraq appeared imminent in 2003, with the aim of systematically recording and monitoring deaths of individual Iraqi civilians from armed violence. Data sources are mainly professional media reports, including international and Iraqi press in translation. IBC uses key-word searches to scan Internet-published, English-language press and media reports of armed violence in Iraq directly resulting in civilian death. This process uses search engines and subscription-based press and media collation services (e.g., LexisNexis). Reports are scanned from over 200 separate press and media outlets meeting IBC’s criteria: (1) public Web-access; (2) site updated daily; (3) all stories separately archived on the site, with a unique URL; and (4) English as a primary or translated language. Sources include dozens of Arabic-language news media that release violent incident reports in English (e.g., Voices of Iraq and Al Jazeera English), and report translation services such as the BBC Monitoring Unit. The three most frequently used sources are Reuters, Associated Press, and Agence France Presse. These and other international media in Iraq increasingly employ Iraqis trained in-house as correspondents. Media-sourced data are cross-checked with, and supplemented by, data from hospitals, morgues, nongovernmental organizations, and official figures.
The cost of such drag-and-drop journalism in this case would be that the above description predates the release by Wikileaks of a vast trove of US military data that IBC has been gradually integrating into its database. Still, it would have been vastly superior to the false little snippets that CJR provides:
The organization aggregates death reports from several dozen news sources—“in other words, people who were named in a newspaper or television broadcast,” says John Tirman, a political theorist at MIT.
…and one more tidbit that could, at a stretch, be viewed as methodological information:
So-called “passively collected figures” come from tabulating external reports or other existing material—like news articles, local television reports, and statistics from morgues—as opposed to original, newly acquired data. Where civilian deaths in Iraq are concerned, passive collection is the method used by a source frequently cited in mainstream press accounts: the aforementioned Iraq Body Count.
The first quote is wrong in its entirety. It low-balls the number of sources IBC uses and massively constricts their scope. News wires are much more important sources for IBC than newspapers or television broadcasts. Incident reports from US soldiers also overshadow newspapers. In addition, there are freedom-of-information requests, morgue data, hospital data, etc.
More importantly, IBC imposes no requirement that only named victims can be recorded in the database. Whenever names are available these are recorded but, unfortunately, names are mostly not available. So such a naming requirement would substantially reduce the number of deaths covered by IBC. But the claim of a naming requirement is simply false.
Saying that the data is “passively collected” is just an empty insult. It follows a cheap debating tactic of affixing an unattractive word to something you don’t like and pretending that you are just using a technical term. This would be akin to a prosecuting attorney referring to a defendant as “the assailant” rather than as “the defendant” during court proceedings. Obviously, such games are not allowed in court and they should not be taken seriously here.
For the record, quite a few people have died collecting the information contained in the IBC data, including reporters and soldiers (whose reports entered through the Wikileaks data release). These victims and many survivors pursued the data very actively indeed. Ultimately someone from IBC sits in front of a screen and enters the information into the database, but if data entry converts data collection into a “passive” undertaking then all survey data must also be classified as “passive”.
OK, so much for the CJR (mis)treatment of the IBC methodology.
The other numbers in the article are from sample survey estimates. What information does Bienaimé provide on the methodologies of these surveys?
For the Lancet studies, epidemiologists at Johns Hopkins used cluster sampling to get a random selection of neighborhoods across the country. They then went knocking on doors and asking questions to determine death rates.
There is no information on how the sampling was done, what questions were asked after knocking on doors, how the estimates and uncertainty intervals were calculated, etc. Saying that the methodology is a “cluster survey” is like saying that the programme at a West End theatre tonight will be “a play”. Critics panning the play can then be dismissed on the grounds that plays are known to be top-quality entertainment.
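To give a sense of the detail the article skips, here is a minimal sketch of the estimation step in a cluster mortality survey. The cluster counts are invented for illustration, and the variance formula is the textbook design-based ratio-estimator form, not a description of the Lancet teams’ actual computations:

```python
# Hypothetical sketch of the estimation step in a cluster mortality
# survey. The cluster totals below are invented for illustration; this
# is the standard design-based ratio estimator, not the Lancet teams' code.
import math

# (deaths, person-years of observation) recorded in each sampled cluster
clusters = [(3, 250.0), (1, 240.0), (5, 260.0), (2, 255.0), (4, 245.0)]

deaths = sum(d for d, _ in clusters)
pyears = sum(p for _, p in clusters)
rate = deaths / pyears  # crude death rate, deaths per person-year

# Between-cluster variance of the ratio estimator: sampling whole
# clusters rather than individuals inflates uncertainty (the "design effect").
n = len(clusters)
resid = [d - rate * p for d, p in clusters]
var = n / (n - 1) * sum(r * r for r in resid) / pyears ** 2
se = math.sqrt(var)

lo, hi = rate - 1.96 * se, rate + 1.96 * se
print(f"rate: {1000 * rate:.1f} per 1,000 person-years "
      f"(95% CI {1000 * lo:.1f} to {1000 * hi:.1f})")
```

Even this toy version makes the point of part I: with a handful of clusters the confidence interval is enormous, which is exactly the uncertainty the CJR article glosses over.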
Then there is a bizarre passage that attempts to explain the idea behind what are known as “excess deaths” (to which we will return in a later post):
Several studies have sought to estimate the hidden impact of the war by comparing death rates measured before and after the conflict. The idea is that whatever difference exists between observed deaths and those representing an increased death rate since the war began can safely be attributed to war’s effects, and labeled as “excess deaths.”
….hmmmm…..so the surveys compare observed deaths with deaths “representing an increased death rate since the war began”….what????? So the interesting thing that’s measured is the unobserved deaths? Does CJR have an English-speaking editor on the payroll?
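For what it’s worth, the idea the passage mangles is simple arithmetic: compare the deaths implied by the measured post-invasion death rate with the deaths the pre-invasion rate would have produced over the same period. A sketch with round, invented inputs (not the published estimates):

```python
# Illustrative arithmetic behind "excess deaths". All inputs are round,
# invented numbers chosen for the example, not published estimates.
pre_rate = 5.5    # deaths per 1,000 people per year before the invasion
post_rate = 13.3  # deaths per 1,000 people per year after the invasion
population = 26_000_000  # people exposed
years = 3.3              # length of the post-invasion period surveyed

expected = pre_rate / 1000 * population * years   # had the pre-war rate held
observed = post_rate / 1000 * population * years  # at the measured rate
excess = observed - expected  # the difference attributed to the war
print(f"excess deaths: {excess:,.0f}")
```

The comparison is between deaths at the measured post-war rate and deaths projected from the pre-war rate, not between “observed deaths” and some phantom category of deaths “representing an increased death rate”.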
That’s it on methodology.
To summarize, all information on methodology in the CJR article is either wrong, vacuous, or incoherent.