Data Dump Friday – A Film of Cats Doing Funny Things plus Somewhere Between 1 and 16 New Iraq Public Opinion Datasets

This is a great clip.

Moreover, I’ve updated the conflict data page as is my custom on Fridays.

It’s hard to quantify how much is new in these additions.

There is one file designated “STATE MEDIA – All Data”.  Then there are a bunch of files with titles like “Media – Wasit”, “Media – Salah Ad Din”, etc., going governorate by governorate.  So I think that the full dataset was broken down into a bunch of mini components that are redundant.

But I figure that it’s better to post exactly what I have rather than investigate everything in detail and post only what I think you need to have.  This isn’t the first situation like this, but this case is so obvious that I figured I should comment.

Have a great weekend.

How Many People were Killed in the Libyan Conflict – Some field work that raises more questions than it answers

Hana Salama asked me for an opinion on this article. I had missed it but it is, potentially, interesting to me so I am happy to oblige her.

I’ve now absorbed it but find myself even more puzzled than I was after reading that Syria survey I blogged on a few weeks back.  Again, it looks like some people did some useful field work but the write up is so bad that it’s hard to know exactly what they did.  In fact, the Libya work is more opaque than the Syria work to the point where I wonder what, if anything, was actually done.

For orientation here is the core of the abstract:


A systematic cross-sectional field survey and non-structured search was carried out over fourteen provinces in six Libyan regions, representing the primary sites of the armed conflict between February 2011 and February 2012. Thirty-five percent of the total area of Libya and 62.4% of the Libyan population were involved in the study. The mortality and injury rates were determined and the number of displaced people was calculated during the conflict period.


A total of 21,490 (0.5%) persons were killed, 19,700 (0.47%) injured and 435,000 (10.33%) displaced. The overall mortality rate was found to be 5.1 per 1000 per year (95% CI 4.1–7.4) and injury rate was found to be 4.7 per 1000 per year (95% CI 3.9–7.2) but varied by both region and time, reaching peak rates by July–August 2011.
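As a quick sanity check (this is my own back-of-envelope arithmetic, not anything from the paper), the abstract’s percentages and rates hang together if the covered population is the roughly 4.2 million reported in Table 1:

```python
# Back-of-envelope check of the abstract's figures, assuming a covered
# population of about 4.2 million (Table 1 of the paper).
covered_population = 4_200_000

killed, injured = 21_490, 19_700

# Share of the covered population (the abstract reports 0.5% killed)
pct_killed = 100 * killed / covered_population    # ~0.51%

# Crude rates per 1,000 per year over the one-year conflict period
death_rate = 1000 * killed / covered_population    # ~5.1, matching the abstract
injury_rate = 1000 * injured / covered_population  # ~4.7, matching the abstract

print(round(pct_killed, 2), round(death_rate, 1), round(injury_rate, 1))
```

If this is right then the headline rates are just the raw counts divided by the covered population, which makes the reported confidence intervals all the more puzzling.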

I’m not sure but I think the researchers (hereafter Daw et al.) tried to count war deaths (plus injuries and displacement numbers) rather than trying to statistically estimate these numbers.  (See this paper on the distinction.)

Actually, I read the whole paper thinking that Daw et al. drew a random sample and did statistical estimation but then I changed my mind.  I got my initial impression at the beginning because they say

This epidemiological community-based study was guided by previously published studies and guidelines.

They then cite the (horrible) Roberts et al. (2004) Iraq survey as providing a framework for their research (see this and follow the links).   Since Roberts et al. was a sample survey I figured that Daw et al. was also a sample survey.  They then go on to say that

Face to face interviews were carried out with at least one member of each affected family….

This also seemed to point in the direction of a sample survey conducted on a bunch of randomly selected households.  (With this method you pick a bunch of households at random, find out how many people lived and died in each one and then extrapolate a national death rate from the in-sample death data.)
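The extrapolation step can be sketched in a few lines of Python.  All numbers below are made up for illustration; this is the generic method, not anything Daw et al. describe:

```python
import random

# Illustrative sketch of sample-survey extrapolation: sample households
# at random, record members and deaths in each, and scale the in-sample
# death rate up to the whole population.  All figures are hypothetical.
random.seed(0)

population_size = 4_200_000  # people in the covered area

# Pretend field data: (household size, deaths in household) pairs.
# Here each household has 3-9 members and a 3% chance of one death.
sampled_households = [(random.randint(3, 9), 1 if random.random() < 0.03 else 0)
                      for _ in range(2_000)]

people_sampled = sum(size for size, _ in sampled_households)
deaths_sampled = sum(d for _, d in sampled_households)

death_rate = deaths_sampled / people_sampled     # in-sample death rate
estimated_deaths = death_rate * population_size  # extrapolated total

print(f"{1000 * death_rate:.1f} deaths per 1,000; "
      f"~{estimated_deaths:,.0f} estimated deaths")
```

A real survey would also need a defensible sampling frame and cluster-aware confidence intervals, which is exactly the sort of detail the write-up never provides.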

But then I realized that the above quote continues with

…listed in the registry of the Ministry of Housing and Planning

Hmmmm….so they interviewed all affected families listed in the registry of some Ministry.  This registry cannot have been a registry of every family living in the areas covered by the survey because there are far more families there than could have been interviewed on this project.  (The areas covered contain around 4.2 million people according to Table 1 of the paper and  surely Daw et al. did not conduct hundreds of thousands of interviews.)
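The arithmetic behind that claim is simple.  The average family size here is my own assumption, not a figure from the paper:

```python
# Rough arithmetic only; 6 people per family is an assumed figure.
covered_population = 4_200_000
avg_family_size = 6

families_in_area = covered_population / avg_family_size
print(f"~{families_in_area:,.0f} families")  # ~700,000 families
```

Even with a generous family size you land in the high hundreds of thousands of families, far beyond any plausible interview count.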

So I’m guessing that the interviews were just of people from families on an official list of victims; killed, injured or displaced.  This guess places a lot of emphasis on one interpretation of the words “listed” and “affected” but it does make some sense.

To be clear, even interviewing one representative from every affected family would have been a gargantuan task since Daw et al. identify around 40,000 casualties (killings plus injuries) and more than 400,000 displaced people.  So we would still be talking about tens of thousands of interviews.

To be honest, now I’m wondering if all these interviews really happened.  That’s an awful lot of interviews and they would have been conducted in the middle of a war.

So now I’m back to thinking that maybe it was a sample survey of a few thousand households.  But if so then the write up has the large flaw that there is no description whatsoever of how its sample was drawn (if, indeed, there was a sample).

Something is definitely wrong here.  I shouldn’t have to get out a Ouija board to divine the authors’ methodology.

The Syria survey discussed a few weeks ago seems to be in a different category.  For that one I have a lot of questions about what they did combined with doubts about whether their methods make sense.  But this Libya write-up seems weird to the point where I wonder whether they were actually out in the field at all.

Maybe an email to Dr. Daw will clear things up in a positive way.  With the Syria paper emailing the lead author got me nowhere but maybe here it will work.  I’m afraid that the best case scenario is that Daw et al. did some useful field work that was obscured by a poor write up and that there is a better paper waiting to get written.




Secret Data Sunday – BBC Edition Part 2 – Data Journalism with Data

Last week I described my initial attempt to obtain some Iraq survey data from the BBC.

You can skip the long back story that explains my interest in these data sets if you want.  In short, though, these award-winning polls played an important role in establishing the historical record for the latest Iraq war but they are very likely to be contaminated with a lot of fabricated data.  ABC news, and its pollster Gary Langer, are hiding the data.  But the BBC is a co-sponsor of the polls so I figured that I could just get the data from the BBC instead.  (This and this give more details on the back story.)

At first I thought, naively, that the BBC had to produce the data in response to a Freedom of Information (FOIA) request.  But when I put this theory to the test I discovered that the BBC is, essentially, immune to FOIA.

So I wrote to the Chairman of the BBC Trust (at the time, Rona Fairhead).  She quickly replied, saying that the Trust can’t intervene unless there is a complaint.  So she passed my letter on to the newsroom and eventually I heard from Nick Sutton who is an editor there.

Nick immediately plopped a bombshell into my lap.

The BBC does not have and never did have the data sets for their award-winning polls.

[Image: studio shot of a handsome man with a confused expression]

To my amazement, BBC reporting on these Iraq public opinion polls just forwarded to its trusting public whatever ABC news told the BBC to say.

Such data journalism without data is over-the-top unethical behaviour by the BBC.

However, you can’t hide data that you don’t have so the ethics issues raised here fall outside the scope of Secret Data Sunday.  Consequently, I’ll return to the data journalism issues later in a middle-of-the-week post.

Here I just finish by returning to my failed FOIA.

Why didn’t the BBC respond to my FOIA data request by simply saying that they didn’t have the data?  Is it that they wanted to hide their no-data embarrassment?   This is possible but I doubt it.  Rather, I suspect that the BBC just responds automatically to all FOIAs by saying that whatever you want is not subject to FOIA because they might use it for journalistic or artistic purposes.  I suspect that they make this claim regardless of whether or not they have any such plans.

To British readers I suggest that you engage in the following soothing activities while you pay your £147 subscriber fee next year.  First, repeatedly recite the mantra “Data Journalism without Data, Data Journalism without Data, Data Journalism without Data,…”.  Then reflect on why the BBC is exempt from providing basic information to the public that sustains it.


The AAPOR Report on 2016 US Election Polling plus some Observations on Survey Measurement of War Deaths – Part 1

I’ve finally absorbed the report of the American Association for Public Opinion Research (AAPOR) on polling in the Trump-Clinton election.  So I’ll jot down my reactions in a series of posts  (see also this earlier post).   In keeping with the spirit of the blog I’ll also offer related thoughts on survey-based approaches to estimating numbers of war deaths.

I strongly recommend the AAPOR report.  It has many good insights and is highly readable.

That said, I’ll mostly criticize it here.

But before I proceed to the substance of the AAPOR report I want to draw your attention to the complete absence of an analogous document in the literature using household surveys to estimate war deaths.

There has been at least one notable success in survey-based war-death estimation and several notable failures.  (Two of the biggest are here and here.)  Yet there has not been any soul searching within the community of practitioners in the conflict field that can be even remotely compared to the AAPOR document.  On the contrary, there is a sad history of epidemiologists militantly promoting discredited work as best practice.  See, for example, this paper which concludes:

The use of established epidemiological methods is rare. This review illustrates the pressing need to promote sound epidemiologic approaches to determining mortality estimates and to establish guidelines for policy-makers, the media and the public on how to interpret these estimates.

The great triumph that drives the above conclusion is the notorious Burnham et al. (2006) study which overestimated the number of violent deaths in Iraq by at least a factor of 4 while endangering the lives of its interviewees.

Turning back to the AAPOR document, I want to underscore that AAPOR, to its credit, has produced a self-critical report and I’m benefiting here from the nice platform their committee has provided.

The report maintains a strong distinction between national polls and state polls.  Rather unfortunately though, the report sets up state pollsters as the poor cousins of the real national pollsters.

It is a persistent frustration within polling and the larger survey research community that the profession is judged based on how these often under-budgeted state polls perform relative to the election outcome.

Analogously, we might say that Democrats are frustrated by the judgments of the electoral college which keeps handing the presidency over to Republicans despite Democrat victories in popular votes.  Yes, I too am frustrated by this weird tic of the American system.  But the electoral college is the way the US determines its presidency and we all have to accept this.   And just as it would be a mistake for Democrats to focus on winning the popular vote while downplaying the electoral college, it’s also a mistake for pollsters to focus on predicting the popular vote while leaving electoral college prediction as an afterthought.

The above quote is followed by something that is also pretty interesting:

The industry cannot realistically change how it is judged, but it can make an improvement to the polling landscape, at least in theory. AAPOR does not have the resources to finance a series of high quality state-level polls in presidential elections, but it might consider attempting to organize financing for such an effort. Errors in state polls like those observed in 2016 are not uncommon. With shrinking budgets at news outlets to finance polling, there is no reason to believe that this problem is going to fix itself. Collectively, well-resourced survey organizations might have enough common interest in financing some high quality state-level polls so as to reduce the likelihood of another black eye for the profession.

I have to think more about this but at first glance this thinking seems sort of like saying:

Look, for a while we’ve been down here in Ecuador selling space heaters and, realistically, that’s not gonna change (although we’re writing this report because our business is faltering).  But maybe next year space heater companies can donate  a few air conditioners to some needy people.  It’s naive to imagine that there will be any money in the air conditioner business in Ecuador but this charity might help us defend ourselves against the frustrating criticism that air conditioner companies are supplying a crappy product.

In other words, it’s clear that a key missing ingredient for better election prediction is more high-quality state polls.  So why is it obvious that the market will not reward more good state polls but it will reward less relevant national ones?

(Side note – I think there are high-quality state polls and I think that the AAPOR committee agrees with me on this.  It’s just that there aren’t enough good state polls and also the average quality level may be lower on state polls than it is on national ones.)

Maybe I’m missing something here.  Is there some good reason why news consumers will always want more national polls even though these are less informative than state polls are?


But maybe journalists should just do a better job of educating their audiences.  A media company could stress that presidential elections are decided state by state, not at the national level, and so this election season they will do their polling state by state, thereby providing a better product than that of their competitors who are only doing national polls.

In short, there should be a way to sell high quality information and I hope that the polling industry innovates to tailor their products more closely to market needs than they have done in recent years.


Secret Data Sunday – BBC Edition Part 1

If you have spent any time on this blog you know that D3 Systems, together with KA Research Limited, fielded a lot of polls in Iraq during the occupation and that the ones I’ve managed to analyze show extensive evidence of containing fabricated data.

Some such polls were commissioned by ABC news and won big awards.  But ABC news and their pollster (Gary Langer) refuse to share their data.  This is a pretty good indication that they are well aware of the rot in their house.

It turns out that ABC news was not the sole sponsor of the series of polls in question.  The BBC was a cosponsor.  So I figured that rather than beating my head against the wall with ABC and Gary Langer I would try with the BBC.

Sadly, it turns out that the BBC stone wall is just as solid as the ABC-Langer one.  In fact, the BBC was so stout in hiding the truth that I’ll need multiple posts to cover their reaction to the news that they are distorting the historical record on the Iraq war.

So let’s get started.

My first try was a Freedom of Information request to the BBC asking for the data.  The one thing I learned from this denied request is that the BBC is pretty much immune to FOIA.  All they have to do is say that they plan to use the thing you want for artistic or journalistic purposes and they are done.  They don’t have to actually use what you want for such purposes – it is enough to just claim that they have a vague intention of doing so.

Below I reproduce the BBC letter which also pretty much reproduces my request.  (The formatting came out a little weird here but it should be readable.)


British Broadcasting Corporation Room BC2 A4 Broadcast Centre White City Wood Lane London W12 7TP
Telephone 020 8008 2882


Information Rights
Professor Michael Spagat

Via email:

4th May 2016
Dear M Spagat,

Freedom of Information request – RFI20160727
Thank you for your request to the BBC of 5th April 2016, seeking the following information under the Freedom of Information Act 2000:

I would like to request the datasets from six opinion polls conducted in Iraq for which BBC was a sponsor. I list them below together with links that may be helpful. The list is taken from the web site of ABC news but the BBC is a sponsor on all these polls and must have the original datasets. I want to be clear that I am asking for the detailed datasets, not just tables of processed results. If it is useful I could send a similar dataset. But what I’m asking for should be the form in which the contractor provided the data to the BBC in the first place.

Thank you very much for your cooperation.

Here is the list:
Field dates: Feb. 17 – 25, 2009
Details: 2,228 interviews via 446 sampling points, oversamples in Anbar province, Basra city, Kirkuk city,
Mosul and Sadr City in Baghdad.
Media partners: ABC/BBC/NHK
Field work: D3 Systems of Vienna, Va., and KA Research Ltd. of Istanbul
Interviewer journal
Photo slideshow
Chart slideshow
PDF with full questionnaire
Field dates: Feb. 12 – 20, 2008
Details: 2,228 interviews via 461 sampling points, oversamples in Anbar province, Basra city, Kirkuk city, Mosul and Sadr City in Baghdad. Media partners: ABC/BBC/ARD/NHK Field work: D3 Systems of Vienna, Va., and KA Research Ltd. of Istanbul Analysis Interviewer journal Photo slideshow Chart slideshow PDF with full questionnaire


Field dates: Aug. 17-24, 2007
Details: 2,212 interviews via 457 sampling points, oversamples in Anbar province, Basra city, Kirkuk city and Sadr City in Baghdad
Media partners: ABC/BBC/NHK
Field work: D3 Systems of Vienna, Va., and KA Research Ltd. of Istanbul, Turkey.
Analysis
Interviewer journal
Photo slideshow
Chart slideshow
PDF with full questionnaire


Field dates: Feb. 25-March 5, 2007
Details: 2,212 interviews via 458 sampling points, oversamples in Anbar province, Basra city, Kirkuk city and Sadr City in Baghdad
Media partners: ABC/USA Today/BBC/ARD
Field work: D3 Systems of Vienna, Va., and KA Research Ltd. of Istanbul
Analysis
Interviewer journal and here.
Photo slideshow
PDF with full questionnaire


Field dates: Oct. 8-Nov. 22, 2005
Details: 1,711 interviews via 135 sampling points, oversample in Anbar province
Media partners: ABC/BBC/NHK/Time/Der Spiegel
Field work: Oxford Research International
Analysis
Photo slideshow
PDF with full questionnaire

Field dates: Feb. 9-28, 2004
Details: 2,737 interviews via 223 sampling points
Media partners: ABC/BBC/NHK/ARD
Field work: Oxford Research International
PDF with full questionnaire
Photo slideshow
The information you have requested is excluded from the Act because it is held for the purposes of ‘journalism, art or literature’. The BBC is therefore not obliged to provide this information to you and will not be doing so on this occasion. Part VI of Schedule 1 to FOIA provides that information held by the BBC and the other public service broadcasters is only covered by the Act if it is held for ‘purposes other than those of journalism, art or literature’. The BBC is not required to supply information held for the purposes of creating the BBC’s output or information that supports and is closely associated with these creative activities.1
The limited application of the Act to public service broadcasters was to protect freedom of expression and the rights of the media under Article 10 European Convention on Human Rights (“ECHR”). The BBC, as a media organisation, is under a duty to impart information and ideas on all matters of public interest and the importance of this function has been recognised by the European Court of Human Rights. Maintaining our editorial independence is a crucial factor in enabling the media to fulfil this function.
That said, the BBC makes a huge range of information available about our programmes and content on We also proactively publish information covered by the Act on our publication scheme and regularly handle requests for information under the Act.

Appeal Rights
The BBC does not offer an internal review when the information requested is not covered by the Act. If you disagree with our decision you can appeal to the Information Commissioner. The contact details are: Information Commissioner’s Office, Wycliffe House, Water Lane, Wilmslow SK9 5AF. Tel: 0303 123 1113 (local rate) or 01625 545 745 (national rate) or see .
Please note that should the Information Commissioner’s Office decide that the Act does cover this information, exemptions under the Act might then apply.

Yours sincerely,
BBC Information Rights
1 For more information about how the Act applies to the BBC please see the enclosure which follows this letter. Please note that this guidance is not intended to be a comprehensive legal interpretation of how the Act applies to the BBC.

Freedom of Information
From January 2005 the Freedom of Information (FOI) Act 2000 gives a general right of access to all types of recorded information held by public authorities. The Act also sets out exemptions from that right and places a number of obligations on public authorities. The term “public authority” is defined in the Act; it includes all public bodies and government departments in the UK. The BBC, Channel 4, S4C and MG Alba are the only broadcasting organisations covered by the Act.

Application to the BBC
The BBC has a long tradition of making information available and accessible. It seeks to be open and accountable and already provides the public with a great deal of information about its activities. BBC Audience Services operates 24 hours a day, seven days a week handling telephone and written comments and queries, and the BBC’s website provides an extensive online information resource.
It is important to bear this in mind when considering the Freedom of Information Act and how it applies to the BBC. The Act does not apply to the BBC in the way it does to most public authorities in one significant respect. It recognises the different position of the BBC (as well as Channel 4 and S4C) by saying that it covers information “held for purposes other than those of journalism, art or literature”. This means the Act does not apply to information held for the purposes of creating the BBC’s output (TV, radio, online etc), or information that supports and is closely associated with these creative activities.
A great deal of information within this category is currently available from the BBC and will continue to be so. If this is the type of information you are looking for, you can check whether it is available on the BBC’s website or contact BBC Audience Services.
The Act does apply to all of the other information we hold about the management and running of the BBC.

The BBC’s aim is to enrich people’s lives with great programmes and services that inform, educate and entertain. It broadcasts radio and television programmes on analogue and digital services in the UK. It delivers interactive services across the web, television and mobile devices. The BBC’s online service is one of Europe’s most widely visited content sites. Around the world, international multimedia broadcaster BBC World Service delivers a wide range of language and regional services on radio, TV, online and via wireless handheld devices, together with BBC World News, the commercially-funded international news and information television channel.
The BBC’s remit as a public service broadcaster is defined in the BBC Charter and Agreement. It is the responsibility of the BBC Trust (the sovereign body within the BBC) to ensure that the organisation delivers against this remit by setting key objectives, approving strategy and policy, and monitoring and assessing performance. The Trustees also safeguard the BBC’s independence and ensure the Corporation is accountable to its audiences and to Parliament.
Day-to-day operations are run by the Director-General and his senior management team, the Executive Board. All BBC output in the UK is funded by an annual Licence Fee. This is determined and regularly reviewed by Parliament. Each year, the BBC publishes an Annual Report & Accounts, and reports to Parliament on how it has delivered against its public service remit.