Secret Data Sunday – BBC Edition Part 2 – Data Journalism without Data

Last week I described my initial attempt to obtain some Iraq survey data from the BBC.

You can skip the long back story that explains my interest in these data sets if you want.  In short, though, these award-winning polls played an important role in establishing the historical record for the latest Iraq war but they are very likely to be contaminated with a lot of fabricated data.  ABC news, and its pollster Gary Langer, are hiding the data.  But the BBC is a co-sponsor of the polls so I figured that I could just get the data from the BBC instead.  (This and this give more details on the back story.)

At first I thought, naively, that the BBC had to produce the data in response to a Freedom of Information (FOIA) request.  But when I put this theory to the test I discovered that the BBC is, essentially, immune to FOIA.

So I wrote to the Chairman of the BBC Trust (at the time, Rona Fairhead).  She quickly replied, saying that the Trust can’t intervene unless there is a complaint.  So she passed my letter on to the newsroom and eventually I heard from Nick Sutton, who is an editor there.

Nick immediately plopped a bombshell into my lap.

The BBC does not have and never did have the data sets for their award-winning polls.


To my amazement, BBC reporting on these Iraq public opinion polls just forwarded to its trusting public whatever ABC news told the BBC to say.

Such data journalism without data is over-the-top unethical behaviour by the BBC.

However, you can’t hide data that you don’t have so the ethics issues raised here fall outside the scope of Secret Data Sunday.  Consequently, I’ll return to the data journalism issues later in a middle-of-the-week post.

Here I just finish by returning to my failed FOIA.

Why didn’t the BBC respond to my FOIA data request by simply saying that they didn’t have the data?  Is it that they wanted to hide their no-data embarrassment?  This is possible but I doubt it.  Rather, I suspect that the BBC just responds automatically to all FOIA requests by saying that whatever you want is not subject to FOIA because they might use it for journalistic or artistic purposes.  I suspect that they make this claim regardless of whether or not they have any such plans.

To British readers I suggest that you engage in the following soothing activities while you pay your £147 subscriber fee next year.  First, repeatedly recite the mantra “Data Journalism without Data, Data Journalism without Data, Data Journalism without Data,…”.  Then reflect on why the BBC is exempt from providing basic information to the public that sustains it.

 

Secret Data Sunday – BBC Edition Part 1

If you have spent any time on this blog you know that D3 Systems, together with KA Research Limited, fielded a lot of polls in Iraq during the occupation and that the ones I’ve managed to analyze show extensive evidence of containing fabricated data.

Some such polls were commissioned by ABC news and won big awards.  But ABC news and their pollster (Gary Langer) refuse to share their data.  This is a pretty good indication that they are well aware of the rot in their house.

It turns out that ABC news was not the sole sponsor of the series of polls in question.  The BBC was a cosponsor.  So I figured that rather than beating my head against the wall with ABC and Gary Langer I would try with the BBC.

Sadly, it turns out that the BBC stone wall is just as solid as the ABC-Langer one.  In fact, the BBC was so stout in hiding the truth that I’ll need multiple posts to cover their reaction to the news that they are distorting the historical record on the Iraq war.

So let’s get started.

My first try was a Freedom of Information request to the BBC asking for the data.  The one thing I learned from this denied request is that the BBC is pretty much immune to FOIA.  All they have to do is say that they plan to use the thing you want for artistic or journalistic purposes and they are done.  They don’t have to actually use what you want for such purposes – it is enough to just claim that they have a vague intention of doing so.

Below I reproduce the BBC letter which also pretty much reproduces my request.  (The formatting came out a little weird here but it should be readable.)

 

British Broadcasting Corporation Room BC2 A4 Broadcast Centre White City Wood Lane London W12 7TP
Telephone 020 8008 2882

Email foi@bbc.co.uk

Information Rights

bbc.co.uk/foi bbc.co.uk/privacy
Professor Michael Spagat

Via email: M.Spagat@rhul.ac.uk

4th May 2016
Dear M Spagat,

Freedom of Information request – RFI20160727
Thank you for your request to the BBC of 5th April 2016, seeking the following information under the Freedom of Information Act 2000:

I would like to request the datasets from six opinion polls conducted in Iraq for which BBC was a sponsor. I list them below together with links that may be helpful. The list is taken from the web site of ABC news but the BBC is a sponsor on all these polls and must have the original datasets. I want to be clear that I am asking for the detailed datasets, not just tables of processed results. If it is useful I could send a similar dataset. But what I’m asking for should be the form in which the contractor provided the data to the BBC in the first place.

Thank you very much for your cooperation.

Here is the list:
2009
Field dates: Feb. 17 – 25, 2009
Details: 2,228 interviews via 446 sampling points, oversamples in Anbar province, Basra city, Kirkuk city, Mosul and Sadr City in Baghdad.
Media partners: ABC/BBC/NHK
Field work: D3 Systems of Vienna, Va., and KA Research Ltd. of Istanbul
Analysis
Interviewer journal
Photo slideshow
Chart slideshow
PDF with full questionnaire
2008
Field dates: Feb. 12 – 20, 2008
Details: 2,228 interviews via 461 sampling points, oversamples in Anbar province, Basra city, Kirkuk city, Mosul and Sadr City in Baghdad.
Media partners: ABC/BBC/ARD/NHK
Field work: D3 Systems of Vienna, Va., and KA Research Ltd. of Istanbul
Analysis
Interviewer journal
Photo slideshow
Chart slideshow
PDF with full questionnaire
2007
Field dates: Aug. 17-24, 2007
Details: 2,212 interviews via 457 sampling points, oversamples in Anbar province, Basra city, Kirkuk city and Sadr City in Baghdad
Media partners: ABC/BBC/NHK
Field work: D3 Systems of Vienna, Va., and KA Research Ltd. of Istanbul, Turkey.
Analysis
Interviewer journal
Photo slideshow
Chart slideshow
PDF with full questionnaire
2007
Field dates: Feb. 25-March 5, 2007
Details: 2,212 interviews via 458 sampling points, oversamples in Anbar province, Basra city, Kirkuk city and Sadr City in Baghdad
Media partners: ABC/USA Today/BBC/ARD
Field work: D3 Systems of Vienna, Va., and KA Research Ltd. of Istanbul
Analysis
Interviewer journal and here.
Photo slideshow
PDF with full questionnaire
2005
Field dates: Oct. 8-Nov. 22, 2005
Details: 1,711 interviews via 135 sampling points, oversample in Anbar province
Media partners: ABC/BBC/NHK/Time/Der Spiegel
Field work: Oxford Research International
Analysis
Photo slideshow
PDF with full questionnaire
2004
Field dates: Feb. 9-28, 2004
Details: 2,737 interviews via 223 sampling points
Media partners: ABC/BBC/NHK/ARD
Field work: Oxford Research International
PDF with full questionnaire
Photo slideshow

The information you have requested is excluded from the Act because it is held for the purposes of ‘journalism, art or literature.’ The BBC is therefore not obliged to provide this information to you and will not be doing so on this occasion. Part VI of Schedule 1 to FOIA provides that information held by the BBC and the other public service broadcasters is only covered by the Act if it is held for ‘purposes other than those of journalism, art or literature’. The BBC is not required to supply information held for the purposes of creating the BBC’s output or information that supports and is closely associated with these creative activities.1
The limited application of the Act to public service broadcasters was to protect freedom of expression and the rights of the media under Article 10 European Convention on Human Rights (“ECHR”). The BBC, as a media organisation, is under a duty to impart information and ideas on all matters of public interest and the importance of this function has been recognised by the European Court of Human Rights. Maintaining our editorial independence is a crucial factor in enabling the media to fulfil this function.
That said, the BBC makes a huge range of information available about our programmes and content on bbc.co.uk. We also proactively publish information covered by the Act on our publication scheme and regularly handle requests for information under the Act.

Appeal Rights
The BBC does not offer an internal review when the information requested is not covered by the Act. If you disagree with our decision you can appeal to the Information Commissioner. The contact details are: Information Commissioner’s Office, Wycliffe House, Water Lane, Wilmslow SK9 5AF. Tel: 0303 123 1113 (local rate) or 01625 545 745 (national rate) or see https://ww.ico.org.uk/ .
Please note that should the Information Commissioner’s Office decide that the Act does cover this information, exemptions under the Act might then apply.

Yours sincerely,
BBC Information Rights
1 For more information about how the Act applies to the BBC please see the enclosure which follows this letter. Please note that this guidance is not intended to be a comprehensive legal interpretation of how the Act applies to the BBC.

Freedom of Information
From January 2005 the Freedom of Information (FOI) Act 2000 gives a general right of access to all types of recorded information held by public authorities. The Act also sets out exemptions from that right and places a number of obligations on public authorities. The term “public authority” is defined in the Act; it includes all public bodies and government departments in the UK. The BBC, Channel 4, S4C and MG Alba are the only broadcasting organisations covered by the Act.

Application to the BBC
The BBC has a long tradition of making information available and accessible. It seeks to be open and accountable and already provides the public with a great deal of information about its activities. BBC Audience Services operates 24 hours a day, seven days a week handling telephone and written comments and queries, and the BBC’s website bbc.co.uk provides an extensive online information resource.
It is important to bear this in mind when considering the Freedom of Information Act and how it applies to the BBC. The Act does not apply to the BBC in the way it does to most public authorities in one significant respect. It recognises the different position of the BBC (as well as Channel 4 and S4C) by saying that it covers information “held for purposes other than those of journalism, art or literature”. This means the Act does not apply to information held for the purposes of creating the BBC’s output (TV, radio, online etc), or information that supports and is closely associated with these creative activities.
A great deal of information within this category is currently available from the BBC and will continue to be so. If this is the type of information you are looking for, you can check whether it is available on the BBC’s website bbc.co.uk or contact BBC Audience Services.
The Act does apply to all of the other information we hold about the management and running of the BBC.

The BBC
The BBC’s aim is to enrich people’s lives with great programmes and services that inform, educate and entertain. It broadcasts radio and television programmes on analogue and digital services in the UK. It delivers interactive services across the web, television and mobile devices. The BBC’s online service is one of Europe’s most widely visited content sites. Around the world, international multimedia broadcaster BBC World Service delivers a wide range of language and regional services on radio, TV, online and via wireless handheld devices, together with BBC World News, the commercially-funded international news and information television channel.
The BBC’s remit as a public service broadcaster is defined in the BBC Charter and Agreement. It is the responsibility of the BBC Trust (the sovereign body within the BBC) to ensure that the organisation delivers against this remit by setting key objectives, approving strategy and policy, and monitoring and assessing performance. The Trustees also safeguard the BBC’s independence and ensure the Corporation is accountable to its audiences and to Parliament.
Day-to-day operations are run by the Director-General and his senior management team, the Executive Board. All BBC output in the UK is funded by an annual Licence Fee. This is determined and regularly reviewed by Parliament. Each year, the BBC publishes an Annual Report & Accounts, and reports to Parliament on how it has delivered against its public service remit.

Secret Data Sunday – Gary Langer Edition

Last Sunday I shared an unanswered email I had sent to the Senior Vice President for Editorial Quality at ABC news.  The email gives a self-contained account of the overall context behind my data request, but I’ll take another pass here just to be as clear as possible.

There were a remarkable number of opinion polls conducted in Iraq during the US occupation.  Many of these were fielded by D3 Systems working with KA Research Limited.  Steve Koczela and I analyzed some of these surveys and found extensive evidence of fabricated data.  We wrote up our findings and asked for comments from interested parties.  D3 and Langer Research Associates then threatened to sue us rather than constructively engaging.  (See this, this and this.)

It’s clear that Langer Research Associates reacted so furiously because Gary Langer did a series of D3-KA Iraq polls for ABC  that won an Emmy Award plus the Policy Impact Award from the American Association for Public Opinion Research.  So he has a lot at stake.

Moreover, the write ups of these ABC polls show that the ABC data display some of the same patterns that Steve and I found in other D3-KA-Iraq polls.  One of the big ones is  opinion unanimity in certain governorates, including Anbar, that is more characteristic of robots than it is of human beings.  With this in mind, check out the highlighted text below.

[Two images: excerpts from the ABC poll write-ups with highlighted text]

Given this background it is, perhaps, not surprising that D3 and Langer went for a legal choke-slam rather than for serious discussion.  Nevertheless, it is disappointing that these research organizations place so little value on the truth.  Thus, there really must be an outside examination of the micro data from ABC’s public opinion polling in Iraq.

I requested the data from Mathew Warshaw of D3 Systems.  He directed me to ABC News.  But, as we know, ABC News ignored my data request.  I also tried Gary Langer who  ignored me at first but finally wrote back on my latest attempt.

This is what I wrote to Langer.

Gary,

This is an opportune moment to renew my data request for the surveys you conducted in Iraq using D3 Systems and KA Research Limited.  You did not reply to my last request.

You abdigate your responsibility to the truth and violate principles of transparency by hiding your data and trying to shut down discussion of your work.

Mike Spagat

This is his reply.

Jeez, you really know how to sweet talk a guy, don’t you?

Extra points for “abdigate.”

OK, I accept full responsibility for misspelling abdicate…..abdicate, abdicate, abdicate, abdigate  gah! dammit….

I’m less apologetic about not being sweeter about my request.  Maybe being sweet is better than not being sweet but, in the end, he should live up to his responsibilities whether or not people talk to him sweetly.

Strangely this isn’t the end of the story but you’ll have to come back next Sunday for more.

Secret Data Sunday – ABC News (in the US) Stonewalls over their Dubious Iraq Public Opinion Polls

Below is an email that I sent to Kerry Smith, the Senior Vice President for Editorial Quality at ABC news, back in November of 2016.

She did not reply.

 

Dear Ms. Smith,

I am a professor of economics specialized in the quantitative analysis of armed conflict.  I have a big body of work focused on data quality issues that arise during data collection in conflict zones, especially survey data.

Back in 2011 I wrote a paper with Steven Koczela, now a prominent pollster with MassINC Polling, that uncovered substantial evidence of fabricated data in polls fielded in Iraq by D3 Systems.  We sent our paper to various interested parties for comments, including Mathew Warshaw of D3 Systems and Gary Langer who had just moved from ABC to found Langer Associates.  We included Mr. Langer in the circulation list because ABC news had used D3 Systems for a series of polls in Iraq that now required urgent re-evaluation.

D3, backed by Langer Associates, responded by threatening to sue me and Mr. Koczela.  See this, this and this.   My university has supported me against this censorship attempt but, unfortunately, Mr. Koczela felt that he could not defend himself and signed an agreement to keep his mouth shut about this particular piece of work.  (This is why only my name appears on the first link above.)  Eventually, the legal threat disappeared when I wrote to Mr. Warshaw asking him to explain what, specifically, he objected to in our analysis.  He did not reply.

To his credit Mr. Koczela continued working on this issue, unearthing a large number of datasets for opinion polls conducted in Iraq by D3 Systems and other polling companies.  These have provided remarkably strong evidence of data fabrication already.  For example, see this eye-popping analysis.

Many of the D3 Iraq surveys that I now have were conducted for the US State Department.  Mr. Koczela made the State Department aware of the problem at some point and they hired Fritz Scheuren, a former president of the American Statistical Association to investigate.  His analysis confirmed the fabrication problem using an analysis rather different from mine.  Unfortunately, Dr. Scheuren signed a nondisclosure agreement but I believe he would confirm in general terms the main gist of this work and he could also give you an authoritative opinion on my analysis.  (scheuren@aol.com)

Notice that after the Huffington Post article Langer Associates did post a response to my 2011 paper.   This is, however, exceptionally weak as I explain in these articles.  Langer Associates have not addressed the new evidence that has emerged since Mr Koczela’s FOIA either.

I emailed Mr. Langer for the data from the ABC Iraq polls but he did not reply.  I asked Mr. Warshaw for the same data and he referred me to ABC news.  I am now requesting the data from you.

 At the risk of belabouring the obvious, I note that people with strong intellectual cases to make do not start by threatening to sue and finish by withholding their data.

Most importantly, ABC needs to take action to correct the historical record of the Iraq war.  These polling numbers are all over the web sites of ABC news and its partner organizations in these polls.  This work must be retracted.

It is, of course, your journalistic obligation to correct the historical record but, at the same time, I think it’s to your advantage to do so.  Fixing this problem would demonstrate a strong commitment to quality and accuracy.  I doubt you would even lose your Emmy Award.  Surely you won’t be punished for pursuing the truth wherever it leads.  I will do anything I can to help in this regard.

I suggest that we meet to discuss these issues further.  I would be happy to fly to New York at my own expense for this purpose.  Alternatively, we could talk by phone, skype or some other technology.

Sincerely,

 

Professor Michael Spagat

Head of Department

Department of Economics

Royal Holloway College

University of London

Egham, Surrey TW20 0EX

United Kingdom

m.spagat@rhul.ac.uk

+44 1784 414001 (W)

+44 1784 439534 (F)

 

Blog:  https://mikespagat.wordpress.com/

War, Numbers and Human Losses: The Truth Counts

Special Journal Issue on Fabrication in Survey Research

The Statistical Journal of the IAOS has just released a new issue with a bunch of articles on fabrication in survey research, a subject of great interest for the blog.

Unfortunately, most of the articles are behind a paywall but, thankfully, the overview by Steve Koczela and Fritz Scheuren is open access.  It’s a beautiful piece – short, sweet, wise and accurate.  Please read it.

Here are my comments.

Way back in 1945 the legendary Leo Crespi stressed the importance of what he called “the cheater problem.”  Although he did this in the flagship survey research journal, Public Opinion Quarterly, the topic has never become mainstream in the profession.  Many survey researchers seem to view the topic of fabrication as not really appropriate for polite company, akin to discussing the sexual history of a bride at her wedding.  Of course, this semi-taboo is convenient for cheaters.  Maria Konnikova has a great new book about confidence artists.  Much in the book is relevant to the subject of fabrication in survey research but one point really stands out for me: a key reason why the same cons and the same con artists move seamlessly from mark to mark is that each victim is too embarrassed to publicize his/her victimization.

Discussions of fabrication that have occurred over the years have almost always focused on what is known as curbstoning, i.e., a single interviewer making up data. (The term comes from an image of a guy sitting on a street curb filling out his forms.)  But this is just one type of cheating and one of the great contributions of Koczela and Scheuren’s journal edition and the impressive series of prior conferences is that they have substantially expanded the scope of the survey fabrication field.  Now we discuss fabrication by supervisors, principal investigators and the leaders of survey companies.  We now know that hundreds of public opinion surveys, especially surveys conducted in poor countries, are contaminated by widespread duplication and near duplication of single observations.  (This journal issue publishes the key paper on duplication.)

Let me quote a bit from the to-do list of Koczela and Scheuren.
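
For readers who want a concrete feel for what "near duplication" means, here is a minimal sketch in Python.  It scores each respondent by the largest share of answers he or she shares with any other respondent in the same survey.  This is just one simple way to flag suspicious observations and I am not claiming it is the exact procedure used in the duplication paper.  The function names and the tiny dataset are made up for illustration.

```python
def percent_match(a, b):
    """Share of questions on which two respondents gave the same answer."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def max_percent_match(rows):
    """For each respondent, the highest percent match with any other respondent."""
    scores = []
    for i, row in enumerate(rows):
        best = max(
            percent_match(row, other)
            for j, other in enumerate(rows)
            if j != i
        )
        scores.append(best)
    return scores


# Hypothetical answers of three respondents to five questions.
rows = [
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 1],  # differs from the first respondent on one question only
    [5, 4, 3, 2, 1],
]
scores = max_percent_match(rows)
print(scores)
```

In a real survey with many questions, a large cluster of respondents with scores close to 1 would be a reason to look harder at the data, not proof of fabrication on its own.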

It does not only happen to small research organizations with fewer resources, as was previously believed [12].  Recent instances involve the biggest and most names in the survey research business, academia and the US Government.

This is certainly true but I would add that reticence about naming names is crippling.  Yes, it’s helpful to know that there are many dubious surveys out there but guidance on which ones they are would be very helpful.

An acknowledgement by the research community that data fabrication is a common threat, particularly in remote and dangerous survey environments would allow the community to be cooperative and proactive in preventing, identifying and mitigating the effects of fabrication.

This comment about remote and dangerous survey environments fits perfectly with my critiques of Iraq surveys including this one.

Given the perceived stakes, these discussion often result in legal threats or even legal action of various types.

Ummm….yes.

…the problem of fabrication is fundamentally one of co-evolution.  The more detection and prevention methods evolve, the more fabricators may evolve to stay ahead.  And to the extent we discover and confirm fabrication, we will never know whether we found it all, or caught only the weakest of the pack.  With these truths in mind, more work is needed in developing and testing statistical methods of fabrication detection.  This is made more difficult by the lack of training datasets, a problem prolonged by a general unwillingness to openly discuss data fabrication.

Again, I couldn’t agree more.

Technical countermeasures during fielding are less useful in harder to survey areas, which also happen to be the areas where the incentive to fabricate data is the highest. Many of the recent advances in field quality control processes focus on areas where technical measures such as computer audio recording, GPS, and other mechanisms can be used [6,13].

In remote and dangerous areas, where temptation to fabricate is the highest, technical countermeasures are often sparse [9]. And perversely, these are often the most closely watched international polls, since they often represent the hotspots of American interest and activity. Robbins and Kuriakose show a heavy skew in the presence of duplicate cases in non-OECD countries, potentially a troubling indicator. These polls conducted in remote areas often have direct bearing on policy for the US and other countries. To get a sense of the impact of the polls, a brief review of the recently released Iraq Inquiry, the so-called Chilcot report, contains dozens of documents that refer, in most cases uncritically, to the impact and importance of polls.

To be honest, Koczela and Scheuren do such a great job with their short essay that I’m struggling to add value here.  What they write above is hugely pertinent to all the work I’ve done on surveys in Iraq.

By the way, a response I sometimes get to my critiques of the notorious Burnham et al. survey of deaths in the Iraq war (see, for example, here, here and here) is that it is unreasonable to expect perfection for a survey operating in such a difficult environment.  Fair enough.  But then you have to concede that we cannot expect high-quality results from such a survey either.  If I were to walk in off the street and take Harvard’s PhD qualification exam in physics (I’m assuming they have such a thing….) it would be unreasonable to expect me to do well.  I just haven’t prepared for such an exam.  Fine, but that doesn’t somehow make me an authority on physics.  It just gives me a perfect excuse for not being such an authority.

Finally, Koczela and Scheuren provide a mass of resources that researchers can use to bring themselves to the frontier of the survey fabrication field.  Anyone interested in this subject needs to take a look at these resources.

Check out my New Article at STATS.org

Hello everybody.

Please have a look at this new article that has just gone up on STATS.org.

It is a compact exposition of the evidence of fabrication in public opinion surveys in Iraq as well as the threats and debates flowing from this evidence.

My current plan for the blog is to do one follow up post on some material that was left on the cutting room floor for the STATS.org article and then move on to other stuff….unless circumstances dictate a return to the Iraq polling issue.

Have a great weekend!

More Evidence of Fabrication in D3 Polls in Iraq: Part 2

On Tuesday I provided some eye-popping comparisons on one Iraq survey fielded by D3/KA against another Iraq survey fielded by another company at exactly the same time.  In light of this evidence any reasonable person has to agree that the D3/KA data are fabricated.  Nevertheless, today I give you a different window into the same D3/KA survey.

Recall that one of the main markers of fabrication in these surveys is that the respondents for what I’m calling the “focal supervisors” have too many “empty categories”.  A response category is “empty” for a group of supervisors if it is offered as a possible choice but zero respondents actually chose it.  For example, in Part 1 of this series we saw that for all public services zero respondents for the focal supervisors said that the service was “unavailable” or that availability was “very good”.  These are, therefore, both empty categories for the focal supervisors.
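
The definition of an empty category is easy to turn into code.  Below is a minimal Python sketch, assuming the responses for a group of supervisors are stored as a dictionary from question to the list of answers chosen.  The question name, the middle answer labels and the data are invented for illustration; they are not taken from the D3/KA dataset.

```python
from collections import Counter


def count_empty_categories(responses, offered):
    """Return the (question, category) pairs that were offered but never chosen.

    responses: dict mapping question -> list of answers actually given
    offered:   dict mapping question -> list of categories on the questionnaire
    """
    empties = []
    for question, categories in offered.items():
        chosen = Counter(responses.get(question, []))
        for category in categories:
            if chosen[category] == 0:
                empties.append((question, category))
    return empties


# Hypothetical questionnaire item and hypothetical answers for one group.
offered = {"electricity": ["unavailable", "poor", "fair", "good", "very good"]}
focal_responses = {"electricity": ["poor", "fair", "fair", "good", "poor"]}

empties = count_empty_categories(focal_responses, offered)
print(empties)
```

With these made-up answers the two extreme categories come out empty, which mirrors the pattern described above.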

Langer Research Associates tried to rationalize all the empties for the focal supervisors by arguing that other supervisors also have empties.  Langer Associates also argued that Steve Koczela and I were unfair to compare the group of focal supervisors with  the group of all the other supervisors.  This is because the number of empties should be decreasing in the total number of interviews and the all-others group did more interviews than the focal group did.  Langer does have a point on this which I addressed in this post.  Here I follow up with a couple of pictures based on the same D3/KA survey discussed on Tuesday.

Each picture takes a bunch of different combinations of supervisors and for each combination plots the number of empties against the number of interviews.  The first plot graphs the data on 100 combinations of three supervisors plus the focals.  The second plot graphs the data on 100 combinations of four supervisors plus the focals.

[Figure: Empties versus interviews, three supervisors plus the focals]

[Figure: Empties versus interviews, four supervisors plus the focals]

You can see that:

1.  The number of empties is, indeed, decreasing in the number of interviews.

2.  Even after adjusting for this fact the focal supervisors still have overwhelmingly more empties than they should have, given the number of interviews they have conducted.
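
Here is a rough Python sketch of how the points in such a plot can be computed.  It assumes that each point is a random combination of k supervisors whose interviews are pooled, with the focal group then placed on the same axes for comparison.  The supervisors, questions and answers are simulated toy data, not the D3/KA survey, and all names are hypothetical.

```python
import random

CATEGORIES = ["unavailable", "poor", "fair", "good", "very good"]
QUESTIONS = ["q1", "q2", "q3"]  # hypothetical question names


def empties_and_interviews(interviews):
    """Return (number of interviews, number of empty categories) for pooled interviews."""
    empties = 0
    for question in QUESTIONS:
        chosen = {interview[question] for interview in interviews}
        empties += sum(1 for category in CATEGORIES if category not in chosen)
    return len(interviews), empties


def random_combinations(by_supervisor, k, n_draws, rng):
    """Draw n_draws random combinations of k supervisors and score each pooled group."""
    names = sorted(by_supervisor)
    points = []
    for _ in range(n_draws):
        combo = rng.sample(names, k)
        pooled = [iv for name in combo for iv in by_supervisor[name]]
        points.append(empties_and_interviews(pooled))
    return points


rng = random.Random(0)  # fixed seed so that the toy example is repeatable

# Toy data: 12 supervisors with 5 to 40 interviews each, answers chosen at random.
by_supervisor = {
    f"sup{i}": [
        {q: rng.choice(CATEGORIES) for q in QUESTIONS}
        for _ in range(rng.randint(5, 40))
    ]
    for i in range(12)
}

# 100 combinations of three supervisors: each point is (interviews, empties).
points = random_combinations(by_supervisor, k=3, n_draws=100, rng=rng)
```

Scattering `points` with interviews on the horizontal axis and empties on the vertical axis, and adding the single point for the focal group, gives a picture of the same type as the ones above.  If the focal group sits far above the cloud at its own interview count then the number of interviews cannot explain its empties.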