Secret Data Sunday: Why does it Matter?

Bernie Sanders made some useful comments last week about the attempt, ultimately successful, to prevent Ann Coulter from speaking at Berkeley:

“What are you afraid of ― her ideas? Ask her the hard questions,” he concluded. “Confront her intellectually. Booing people down, or intimidating people, or shutting down events, I don’t think that that works in any way.”

I totally agree with Sanders, and anyone on the fence on this issue should read this article about how Georgetown students politely put tough questions to Sebastian Gorka, who had no answers and fled.

Sure, you might say, but what does this have to do with people who hide their data?

The connection has to do with confidence …  or lack thereof.

If you fear that Ann Coulter will run rings around you in a debate then why not try to shut her down before she gets the chance?  But if you are confident that you can outmaneuver Seb Gorka then why not exchange views with him in public?

Similarly, if you are afraid that an independent researcher might expose embarrassing weaknesses in your data and/or analysis then you are drawn to hiding your data.  But if you are confident in your data and your work then  you are not afraid of outside scrutiny.  In fact, you positively welcome outside scrutiny because you might learn something useful from it.

Another parallel between the two situations is that in both cases the choice of remaining closed should not be an allowable option.  Berkeley should have let Coulter speak (after having invited her) regardless of whether or not the dominant locals there are afraid of her.  Similarly, the dataset for the UN-sponsored Iraq Child and Maternal Mortality Survey really should be in the public domain even though releasing it will embarrass UNICEF and some people associated with the survey.

Over the next few weeks I’ll continue to give examples of people hiding important conflict datasets.  I believe that lack of confidence is a common denominator that runs underneath all these situations.

We need to draw appropriate inferences when we ask for data and the answer is “no”.


I’m tweeting

I finally started tweeting from @Michael_Spagat

Please follow me!

I plan to do more than just robo tweets of my blog posts.  For now I’m expecting to cover the usual subject matter of the blog plus Trump and probably a little Brexit.   But we’ll see how it goes.


Data Dump Friday

I suppose it will come as no surprise that I’m putting up a bit more Iraq public opinion survey data sponsored by the US State Department and obtained through a FOIA.  This time it’s some polls from April of 2006.

I’m unlikely to dump more data over the next three weeks because I’ll be traveling and I’m still not set up with all the ingredients to do these while traveling.  However, I should be doing a fair amount of regular blogging.

Secret Data Sunday – The Iraq Child and Maternal Mortality Survey

Many readers of the blog know that there was a major cock-up over child mortality figures for Iraq.  In fact, exaggerated child mortality figures have been used to justify the 2003 invasion of Iraq, both prospectively and retrospectively.

Here I won’t repeat the basics one more time, although anyone unfamiliar with this debacle should click on the above link which, in turn, offers further links providing more details.

Today I just inject one new point into this discussion – the dataset for the UNICEF survey that wildly overestimated Iraq’s child mortality rates is not available.  (To be clear, estimates from this dataset are available but the underlying data you need to audit the survey are hidden.)

The hidden survey is called the Iraq Child and Maternal Mortality Survey (ICMMS).  This graph (which you can enlarge on your screen) reveals the ICMMS as way out of line with no fewer than four subsequent surveys, all debunking the stratospheric ICMMS child mortality estimates.  The datasets for three of the four contradicting surveys are publicly available and open to scrutiny.  (I will return to the fourth of the contradicting surveys in a future blog post.)

But the ICMMS dataset is nowhere to be found – and I’ve looked for it.

For starters, I emailed UNICEF but couldn’t find anyone there who had it or was willing to share it.

I also requested the dataset multiple times from Mohamed Ali, the consulting statistician on the survey who now is at the World Health Organization (WHO).

At one point Mohamed directed me to the acting head of the WHO office in Iraq, who blew me off before I had a chance to request the data from him.  But, then, you have to wonder what the current head of the WHO office in Iraq has to do with a 1990s UNICEF survey, anyway.

I persisted with Mohamed who then told me that if he still has the data it would be somewhere on some floppy disk.  This nostalgic reminder of an old technology is kind of cute but doesn’t let him off the hook for the dataset which I never received on a floppy disk or otherwise.

There is a rather interesting further wrinkle on this saga of futility.  The ICMMS dataset was heavily criticized in research commissioned for the UN’s oil for food report:

It is clear, however, that widely quoted claims made in 1995 of 500,000 deaths of children under 5 as a result of sanctions were far too high;

John Blacker, Mohamed Ali and Gareth Jones then responded to this criticism with a 2007 academic article defending the ICMMS dataset:

A response to criticism of our estimates of under-5 mortality in Iraq, 1980-98.


According to estimates published in this journal, the number of deaths of children under 5 in Iraq in the period 1991-98 resulting from the Gulf War of 1991 and the subsequent imposition of sanctions by the United Nations was between 400,000 and 500,000. These estimates have since been held to be implausibly high by a working group set up by an Independent Inquiry Committee appointed by the United Nations Secretary-General. We believe the working group’s own estimates are seriously flawed and cannot be regarded as a credible challenge to our own. To obtain their estimates, they reject as unreliable the evidence of the 1999 Iraq Child and Maternal Mortality Survey–despite clear evidence of its internal coherence and supporting evidence from another, independent survey. They prefer to rely on the 1987 and 1997 censuses and on data obtained in a format that had elsewhere been rejected as unreliable 30 years earlier.

For the record, the Blacker, Ali and Jones article is weak and unconvincing and I may make it the subject of a future blog post.  But today I just concentrate on the (non)availability of the ICMMS dataset so I won’t wander off into a critique of their article.

Thinking purely in terms of data availability, the 2007 article raises some interesting questions.  Was Mohamed Ali still working off of floppy disks in 2007 when he published this article?  Surely he must have copied the dataset onto a hard disk to do the analysis.  And what about his co-authors?  They must have the dataset too, no?

Unfortunately, John Blacker has passed away but Gareth Jones is still around so I emailed him asking for the ICMMS dataset which he had defended so gamely.

He replied that he didn’t have access to the dataset when he wrote the 2007 article and still doesn’t have access now.  [MS – I reviewed the correspondence a few weeks after writing this post and realized that Jones did have access to the data way back when he worked for UNICEF but lost it after retiring a long time ago.  So he has seen the data but didn’t have it when writing his academic article defending it.]

Let that point sink in for a moment.  Jones co-authored an article in an academic journal, the only point of which was to defend the quality of a dataset.  Yet he didn’t have access to the dataset that he defended?  Sorry, but this doesn’t work for me.  As far as I’m concerned, when you write an article that is solely about the quality of a dataset then you need to at least take a little peek at the dataset itself.

I see two possibilities here and can’t decide which is worse.  Either these guys are pretending that they don’t have a dataset that they actually do have because they don’t want to make it public or they have been defending the integrity of a dataset they don’t even have.  Either way, they should stop the charade and declare that the ICMMS was just a big fat mistake.

I have known for a long time that the ICMMS was crap but the myth it generated lives on.  It is time for the principal defenders of this sorry survey to officially flush it down the toilet.

Data Dump Friday Returns!

Hello again.

I’ve loaded up the data page with more than 20 new public opinion surveys.  That’s the good news.

The bad news is that these are all surveys fielded in Iraq by D3 Systems and KA Research Limited.  So I expect them to be loaded with fabricated data.  Ultimately, I think the main value of these datasets will be that they will help people to refine their skills at detecting fabricated survey data.

I’ve now posted everything I have by D3/KA.

There is more D3/KA Iraq data out there but organizations like ABC News and Langer Research Associates are hiding what they have.  There have been some developments on this front which I will start sharing in a new regular feature of this web site which I will call Secret Data Sunday.

Stay tuned.


New Paper on Accounting for Civilian War Casualties

Hello everybody.

The radio silence was much longer than intended but blog posts should start coming fast and furious now.  I’ve got a lot I want to get off my chest as soon as possible.

Let’s get the ball rolling with a new paper I have with Nicholas Jewell and Britta Jewell.  (Well, to be honest, it isn’t really a brand new paper but it’s newly accepted at a journal and we’re now putting it into the public domain.)

I dare say that this paper is a very readable introduction to civilian casualty recording and estimation, that is, to most of the subject matter of the blog.  I hope you will all have a look.

And, please, send in your comments.

More soon…

PS – Here is an alternative link to the paper in case the first one doesn’t work for you.