My Free Online Course is Ready and About to Launch!

Hello everyone.

I haven’t posted for a while, mainly because I’ve been completely swamped creating my free online course, which launches on Monday.

The course is on exactly the sort of material I cover on the blog, so if you’re following the blog you should seriously consider signing up for the course.  It’s called:

“Accounting for Death in War: Separating Fact from Fiction”


Secret Data Sunday – AAPOR Investigates the Trump-Clinton Polling Miss Using Data you Can’t See

The long-awaited report from the American Association for Public Opinion Research (AAPOR) on the performance of polling in the Trump-Clinton race is out.  You will see that this material is less of a stretch for the blog than it might seem at first glance, and I plan a second post on it.

Today I just want to highlight the hidden data issue which rears its head very early in the report:

The committee is composed of scholars of public opinion and survey methodology as well as election polling practitioners. While a number of members were active pollsters during the election, a good share of the academic members were not. This mix was designed to staff the committee both with professionals having access to large volumes of poll data they knew inside and out, and with independent scholars bringing perspectives free from apparent conflicts of interest. The report addresses the following questions:

So on the one hand we have pollsters “having access to large volumes of poll data” and on the other hand we have “independent scholars” who….errr….don’t normally have access to large volumes of polling data because the pollsters normally hide it from them.   (I’m not sure what the apparent conflict of interest of the pollsters is but I guess it’s that they might be inclined to cover up errors they may have made in their election forecasts.)

You might well ask why all these datasets aren’t in the public domain.


Sadly, there is no good answer to that question.

But the reason all these important data remain hidden is pretty obvious.  Pollsters don’t want independent analysts to embarrass them by finding flaws in their data or their analysis.

This is a bad reason.

There is a strong public interest in having the data available.  The data would help all of us, not just the AAPOR committee, understand what went wrong with polling in the Trump-Clinton race.  The data would also help us learn why Trump won, which is clearly an important question.


But we don’t have the data.

I understand that there are valid commercial reasons for holding polling data privately while you sell some stories about it.  But a month should be more than sufficient for this purpose.

It is unacceptable to say that sharing requires resources that you don’t have because sharing data just doesn’t require a lot of resources.  Yes, I know that I’ve whinged a bit on the blog about sharing all that State Department data and I’m doing it in tranches.  Still, this effort is costing me only about 15-30 minutes per dataset.  It’s really not a big deal.

I suppose somebody might say that these datasets are collected privately and so it’s OK to permanently keep them private.  But election polls drive public discussions and probably affect election outcomes.  There is a really strong public interest in disclosure.

There is further material in the report on data openness:

Since provision of microdata is not required by the AAPOR Transparency Initiative, we are particularly grateful to ABC News, CNN, Michigan State University, Monmouth University, and University of Southern California/Los Angeles Times for joining in the scientific spirit of this investigation and providing microdata. We also thank the employers of committee members (Pew Research Center, Marquette University, SurveyMonkey, The Washington Post, and YouGov) for demonstrating this same commitment.

I’ve written before about how AAPOR demands transparency on everything except the main thing you would think of when it comes to survey transparency – showing your data.

I’ll return to this AAPOR problem in a future Secret Data Sunday.  But for now I just want to say that the Committee’s appeal to a “scientific spirit” falls flat.  Nobody outside the committee can audit the AAPOR report, and it will be unnecessarily difficult to further develop lines of inquiry initiated by the report, for one simple reason: nobody outside the committee has access to all of the data the committee analyzed.  This is not science.

OK, that’s all I want to say today.  I’ll return to the main points of the report in a future post.

I’ve Done Something or Other and Say that 470,000 People were Killed in Syria – Would you Like to Interview Me?

Let’s go back to February of 2016 when the New York Times ran this headline:

Death Toll from War in Syria now 470,000, Group Finds

The headline is more conservative than a caption in the same article which reads:

At least [my emphasis] 470,000 Syrians have died as a result of the war, according to the Syrian Center for Policy Research.

This switch between the headline and the caption is consistent with a common pattern of converting an estimate, that might be either too high or too low, into a bare minimum.

Other respected outlets such as PBS and Time jumped onto the 470,000 bandwagon, with the Guardian claiming primacy in this story with an early exclusive that quotes the report’s author:

“We use very rigorous research methods and we are sure of this figure,” Rabie Nasser, the report’s author, told the Guardian. “Indirect deaths will be greater in the future, though most NGOs [non-governmental organisations] and the UN ignore them.

“We think that the UN documentation and informal estimation underestimated the casualties due to lack of access to information during the crisis,” he said.

Oddly, none of the news articles say anything about what this rigorous methodology is.  The Guardian refers to “counting” which I would normally interpret as saying that the Syrian Center for Policy Research (SCPR) has a list of 470,000 people killed but it is not at all clear that they really have such a list.

This report was the source for all the media attention.  The figure of 470,000 appears just once in the report, in a throwaway line in the conclusion:

 The armed conflict badly harmed human development in Syria where the fatalities in 2015 reached about 470,000 deaths, the life expectancy at birth estimated at 55.4 years, and the school age non-attendance rate projected at 45.2 per cent; consequently, the HDI of Syria is estimated to have lost 29.8 per cent of its HDI value in 2015 compared to 2010.

The only bit of the report that so much as hints at where the 470,000 number came from is this:

The report used results and methodology from a forthcoming SCPR report on the human development in Syria that is based on a comprehensive survey conducted in the mid of 2014 and covered all regions in Syria. The survey divided Syria into 698 studied regions and questionnaire three key informants, with specific criteria that guarantee inclusiveness and transparency, from each region. Moreover, the survey applied a strict system of monitoring and reviewing to ensure the correctness of responses. About 300 researchers, experts, and programmers participated in this survey.

This is nothing.

The hunger for scraps of information on the number of people killed in Syria is, apparently, so great that it is feasible to launch a bunch of news headlines just by saying you’ve looked into this question and come up with a number that is larger than what was previously thought.  (I strongly suspect that having a bigger number which you use to dump on any smaller numbers is a key part of getting noticed.)

That said, the above quote does promise a new report with more details and eventually a new report was released – but the details in the new report on methodology are still woefully inadequate.  They divide Syria up, interview three key informants in each area and then, somehow, calculate the number of dead people based on these interviews.  I have no idea what this calculation looks like.  There is a bit of description on how SCPR picked their key informants but, beyond that, the new report provides virtually no information relevant for evaluating the 470,000 figure.  The SCPR doesn’t even provide a copy of their questionnaire, so I can hardly guess at what it looks like.

One thing is clear though – they did not use the standard sample survey method for estimating the number of violent deaths.  Under this approach you pick a bunch of households at random, do interviews on the number of people who have lived and died in each one and extrapolate a national death rate based on death rates observed in your sample households.  If the SCPR had done something like this then at least I would’ve had a sense of where the 470,000 number came from, although I’d still want to know details.
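For contrast, the standard method can be sketched in a few lines.  Every number below (the cluster sizes, the death counts, the national population) is invented purely for illustration; none of it is SCPR data:

```python
# Sketch of the standard household-survey extrapolation.  All numbers here
# are invented for illustration; they are not SCPR data.

def estimate_violent_deaths(clusters, national_population):
    """Extrapolate a national total from (people_surveyed, deaths_observed) pairs."""
    people = sum(p for p, _ in clusters)
    deaths = sum(d for _, d in clusters)
    return deaths / people * national_population  # sample death rate, scaled up

# Three toy clusters of roughly 200 people each: rate 3/600 -> 110,000.
sample = [(210, 1), (190, 0), (200, 2)]
print(round(estimate_violent_deaths(sample, 22_000_000)))
```

The point is that the estimate is fully auditable: anyone holding the household-level data can recompute it.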

I emailed Rabie Nasser asking for details but didn’t hear back.  Who knows.  Maybe my message went into his spam folder.  There are other people associated with this work and I’ll try to contact them and will report back if I hear something interesting.

I want to be clear.  I’m not saying that this work is useless for estimating the number of people killed in the Syrian war.  In fact, I suspect that the SCPR generated some really useful information on this question and on other issues as well.  But until they explain what they actually did I would just disregard the work, particularly the 470,000 figure.  I’m not saying that I think this number is too high or that it is too low.  I just think that it is floating in thin air without any methodological moorings to enable us to understand it.

Journalists should lay off press releases taking the form of “I did some unspecified research and here are my conclusions.”


Mismeasuring Deaths in Iraq: Addendum on Confidence Interval Calculations


In my last post I used a combination of bootstrapping and educated guesswork to find  confidence intervals for violent deaths in Iraq based on the data from the Roberts et al. survey.  (The need for guesswork arose because the authors have not been forthcoming with their data.)

Right after this went up a reader contacted me and asked whether the bottom of one of these confidence intervals can go below 0.

The short answer is “no” with the bootstrap method.  This technique can only take us down to 0 and no further.


With bootstrapping we randomly select from a list of 33 clusters.  Of course, none of these clusters experienced a negative number of violent deaths. So 0 is the smallest possible count we can get for violent deaths in any simulation sample.  (In truth, the possibility of pulling 33 0’s is more theoretical than real.  This didn’t happen in any of my 1,000 draws of 33.)
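To make this concrete, here is a minimal sketch of the bootstrap using my assumed allocation of violent deaths across the 33 clusters and the roughly 3,000 estimated deaths per sampled death; the seed and the 1,000 draws mirror the simulation described above:

```python
import random

# Assumed allocation of violent deaths across the 33 clusters:
# 18 zeros, 7 ones, 7 twos, and Fallujah's 52.
clusters = [0] * 18 + [1] * 7 + [2] * 7 + [52]

random.seed(1)  # make the simulation repeatable
totals = []
for _ in range(1000):
    resample = random.choices(clusters, k=33)  # draw 33 clusters with replacement
    totals.append(sum(resample) * 3000)        # ~3,000 estimated deaths per sampled death

totals.sort()
lower, upper = totals[25], totals[974]  # approximate 95% percentile interval
print(lower, upper)  # the lower end can reach 0 but never goes below it
```

Since every resampled total is a sum of non-negative counts, the percentile interval is bounded below by zero by construction.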

Nevertheless, it turns out that if we employ the most common methods for calculating confidence intervals (not bootstrapping) then the bottom of the interval does dip below 0 when the dubious Fallujah cluster is included.

Here’s a step by step walk-through of the traditional method applied to the Roberts et al. data.  (I will assume that violent deaths are allocated across the 33 clusters as 18 0’s, 7 1’s, 7 2’s and 1 52.)

  1. Compute the mean number of violent deaths per cluster.  This is 2.2.  An indication that something is screwy here is the fact that the mean is bigger than the number of violent deaths in 32 out of the 33 clusters.  At the same time the mean is way below the number of violent deaths in the Fallujah cluster (52).  Note that without the Fallujah cluster the mean becomes 0.7, i.e., eliminating Fallujah cuts the mean by more than a factor of 3.
  2. Compute the sample standard deviation which is a measure of how strongly the number of violent deaths varies by cluster.  This is 9.0.  Note that if we eliminate the Fallujah cluster then the sample standard deviation plummets by more than a factor of 10, all the way down to 0.8.  This is just a quantitative expression of the obvious fact that the data are highly variable with Fallujah in there.  Note further that the big outlier observation affects the standard deviation more than it affects the mean.
  3. Adjust for sample size.  We do this by dividing the sample standard deviation by the square root of the sample size.  This gives us 1.6.  Here the idea is that you can tame the variation in the data by taking a large sample.  The larger the sample size the more you tame the data.  However, as we shall see, the Fallujah cluster makes it impossible to really tame the data with a sample of only 33 clusters.
  4. Unfortunately, the last step is mysterious unless you’ve put a fair amount of effort into studying statistics.  (This, alone, is a great reason to prefer bootstrapping which is very intuitive.)  Our 95% confidence interval for the mean number of violent deaths per cluster is, approximately, the average plus or minus 2 times 1.6, i.e., -1.0 to 5.4.  There’s the negative lower bound!
  5. We can translate from violent deaths per cluster to estimated violent deaths by multiplying by 33 and again by 3,000.  We end up with -100,000 to 530,000.  (I’ve been rounding at each step.  If, instead I don’t round until the very end I get -90,000 to 530,000….this doesn’t really matter.)  Note that without Fallujah we get a confidence interval of 30,000 to 90,000 which is about what we got with bootstrapping.
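The five steps above can be reproduced in a few lines, again using my assumed allocation of violent deaths across clusters:

```python
import statistics

# Steps 1-5 with the assumed allocation of violent deaths across clusters.
clusters = [0] * 18 + [1] * 7 + [2] * 7 + [52]
n = len(clusters)  # 33

mean = statistics.mean(clusters)  # step 1: about 2.2
sd = statistics.stdev(clusters)   # step 2: sample standard deviation, about 9.0
se = sd / n ** 0.5                # step 3: about 1.6

low = mean - 2 * se               # step 4: about -1.0 ...
high = mean + 2 * se              #         ... to about 5.4

# step 5: scale to estimated violent deaths; roughly -90,000 to 530,000
print(round(low * 33 * 3000), round(high * 33 * 3000))
```

Dropping the 52 from the list and rerunning gives the interval of roughly 30,000 to 90,000 quoted in step 5.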

Have we learned anything here other than that I respond to reader questions?

I don’t think we’ve learned much, if anything, about violent deaths in Iraq.  We already knew that the Roberts et al. data, especially the Fallujah observation, are questionable and maybe the above calculation reinforces this view a little bit.

But, mostly, we learn something about the standard method for calculating confidence intervals; when the data are wild this method can give incredible answers.  Of course, a negative number of violent deaths is not credible.

There is an intuitive reason why the standard method fails with the Roberts et al. data; it forces a symmetric estimate onto highly asymmetric data.  Remember we get 2.2 plus or minus 3.2 average violent deaths per cluster.  The plus or minus means that the confidence interval is symmetric.  The Fallujah observation forces a wide confidence interval which has to go just as wide on the down side as it is on the up side.  In some sense the method is saying that if it’s possible to find a cluster with 52 violent deaths then it also must be possible to find a cluster with around -52 violent deaths.  But, of course, no area of Iraq  experienced -52 violent deaths.  So you wind up with garbage.

Part of the story is also the small sample size. With twice as many clusters, but the same sort of data, the lower limit would only go down to about 0.

It’s tempting to just say “garbage in, garbage out” and, up to a point, this is accurate.   But the bigger problem is that the usual method for calculating confidence intervals is not appropriate in this case.

Mismeasuring War Deaths in Iraq: The Partial Striptease

I now continue the discussion of the Roberts et al. paper that I started in my series on the Chilcot Report.  This is a tangent from Chilcot so I’ll hold this post and its follow-ups outside of that series.

Les Roberts never released a proper data set for his survey.  Worse, the authors are sketchy on important details in the paper, leaving us to guess on some key issues.  For example, in his report on Roberts et al. to the UK government Bill Kirkup wrote:

The authors provide a reasonable amount of detail on their figures in most of the paper.  They do, however, become noticeably reticent when it comes to the breakdown of deaths into violent and non-violent, and the breakdown of violent deaths into those attributed to the coalition and those due to terrorism or criminal acts, particularly taking into account the ‘Fallujah problem’…

Roberts et al. claim that “air strikes from coalition forces accounted for most violent deaths” but Kirkup points out that without the dubious Fallujah cluster it’s possible that the coalition accounted for less than half of the survey’s violent deaths.

Kirkup’s suspicion turns out to be correct.

However, you need to look at this email from Les Roberts to a blog to settle the issue.  It turns out that coalition air strikes account for 6 of the 21 violent deaths outside Fallujah, with 4 further deaths attributed to the coalition using other weapons.

My primary point here is about data openness rather than about coalition air strikes.  Roberts et al. should just show their data rather than dribbling it out in dribs and drabs into the blogosphere.

Roberts gives another little top up here.  (I give that link only to document my source.  I recommend against ploughing through this Gish Gallop by Les Roberts.)  Buried deep inside a lot of nonsense Roberts writes:

The Lancet estimate [i.e. Roberts et al.], for example, assumes that no violent deaths have occurred in Anbar Province; that it is fair to subtract out the pre-invasion violence rate; and that the 5 deaths in our data induced by a US military vehicles are not “violent deaths.”

Hmmm…..5 deaths caused by US military vehicles.

Recall that each death in the sample yields around 3,000 estimated deaths. This translates into 15,000 estimated deaths caused by US military vehicles – nearly 30 per day for a year and a half.  There have, unfortunately, been a number of Iraqis killed by US military vehicles.  Iraq Body Count (IBC) has 110 such deaths in its database during the period covered by the Roberts et al. survey.  I’m sure that IBC hasn’t captured all deaths in vehicle accidents but, nevertheless, the 15,000 figure looks pretty crazy.
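The arithmetic behind that claim, as a quick sketch (the roughly 18-month window is my approximation of the survey’s recall period):

```python
# Back-of-the-envelope check of the vehicle-deaths implication.
deaths_in_sample = 5
per_sampled_death = 3000  # estimated deaths per death observed in the sample
estimated = deaths_in_sample * per_sampled_death  # 15,000

days = 547  # roughly 18 months, my approximation of the survey window
print(estimated, round(estimated / days, 1))  # ~27 per day, i.e. nearly 30
```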

Again I come back to my main point – please just give us a proper dataset rather than a partial striptease.  Meanwhile, I can’t help thinking Roberts et al. are holding back on the data because it contains more embarrassments that we don’t yet know about.

PS – After providing the above quote I feel obligated to debunk it further.

  1. Roberts writes that his estimate omits deaths in Anbar Province (which contains Fallujah).  But many claims in his paper are only true if you include Anbar (Fallujah).  Indeed, this very blog post opened with one such claim.  We see that Fallujah is in for the purpose of saying that most violent deaths were caused by coalition airstrikes but Fallujah is out when it’s time to talk about how conservative the estimate is because it omits Fallujah.  Call this the “Fallujah Shell Game”.  (See the comments of Josh Dougherty here.)
  2. Roberts suggests that he bent over backwards to be fair by omitting pre-invasion violent deaths from his estimate.  But, first of all, there was only one such death so it hardly makes a difference whether this one is in or out.  Second, it’s hard to understand what the case would be for blaming a pre-invasion death on the invasion.


Comments Down Below!

Hello everybody.

This is just a quick note to say that there were some interesting comments that appeared on my last two posts on Chilcot (here and here).  I’ve just replied to both.

While I’m at it I have a question for Bill Kirkup (who made one of the comments).  Can he give us a little briefing on how death certificates have been handled in post-invasion Iraq?

Actually, I have a number of specific questions on this subject. I’d be happy to switch to email to clear these up and then post a summary if that works best (