I’ll start with an important announcement. Steve Koczela just had success with a Freedom of Information Request to the US State Department. This means that he now has a mountain of new polling data from Iraq which he will be releasing in due course.
Some of these surveys were fielded by D3/KA, giving us a great chance to test our findings out of sample. On top of that there are some surveys fielded by another company which provides an even better opportunity to get to the bottom of what has been going on in these polls.
I couldn’t resist having a look today at a D3/KA survey from 2006.
The survey has a battery of questions on the quality of public services. I give the questions at the bottom of this post. The possible answers are: “very good”, “good”, “poor”, “very poor”, “not available” and “don’t know”. Based on previous work I predict that supervisors 36, 43 and 44 are cheaters. So I divide the sample into two pieces: the interviews of these supervisors and the interviews of all the other supervisors.
For the supervisors I predicted to be cheaters, the most common answer to these questions is that services are “very poor”. Not a single person says that services are “very good” or that they “don’t know”. This is strange. You’d expect at least one respondent out of 443 to go for one of these answers, but let’s leave that aside. Maybe these people are all very sure that they are receiving bad services.
Much more surprising is that not a single person says that a service is “not available”. So, overwhelmingly, services are very bad or bad … but still available. This is weird. Don’t you think that at least a few of these dissatisfied customers would tick the worst box of all?
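This kind of coverage check is easy to automate: tally the answer categories within each group of interviews and see which boxes never get ticked. Here is a minimal sketch on made-up data (the counts and the `coverage` helper are my own illustration, not the actual survey file):

```python
from collections import Counter

# The six possible answers on the battery of service questions.
CATEGORIES = ["very good", "good", "poor", "very poor",
              "not available", "don't know"]

# Hypothetical answers to one question, mimicking the pattern in the post:
# the flagged group uses only a narrow slice of the scale, the others use all of it.
flagged = ["very poor"] * 300 + ["poor"] * 100 + ["good"] * 43
others = CATEGORIES * 50

def coverage(answers):
    """Return the answer categories that never appear, in scale order."""
    counts = Counter(answers)
    return [c for c in CATEGORIES if counts[c] == 0]

print(coverage(flagged))  # categories the flagged interviews never use
print(coverage(others))   # [] — every box gets ticked
```

With 443 real interviews, several categories going completely unused is exactly the kind of gap this check would surface.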
All boxes get ticked for the group of the other supervisors. These supervisors did 1,557 interviews, so you could follow the Exhaustive Review and say that their fuller coverage is simply down to their higher numbers. In a future post I will explain why I am not convinced on this point, but let’s leave this aside as well for today.
Instead, let’s look at correlations between answers to different questions. For example, to what extent are people who are happy with their trash collection also happy with their landline service, etc.?
Here’s a list of the correlations on this battery of questions. On the left are the interviews of the predicted cheaters and on the right are the interviews of all the others.
[Table: pairwise correlations for the battery of questions — Predicted Cheaters (left) and All the Others (right)]
Look at all the perfect correlations of 1.00 for the predicted cheaters!
Every time you see a 1.00 you should hear the sound of 443 people answering questions in perfect lock step with one another. If you are slightly happier with your electricity than I am then you are also slightly happier with your water than I am…and also slightly happier about your landline…and slightly happier with your mobile, and your garbage collection….and traffic management in your area.
I didn’t make that up. All of the above variables are perfectly correlated. C’mon guys. You’re making yourselves too easy to catch.
For the supervisors not flagged in advance as likely cheaters there is never a perfect correlation between two questions. This is what we would expect in real interviews.
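The check itself is simple: compute every pairwise Pearson correlation within each group and count how many come out as exactly 1.00. Here is a sketch on synthetic data (the fabrication mechanism — every question being a shifted copy of one underlying score — is my own stand-in for whatever the cheaters actually did):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fabricated data: 443 "interviews", 10 questions, where every
# question is the same underlying score plus a constant shift. Adding a
# constant leaves Pearson correlation unchanged, so all pairs correlate at 1.00.
base = rng.integers(1, 5, size=443)
fabricated = np.column_stack([base + i for i in range(10)])

# Hypothetical genuine data: answers share a common component but carry
# independent noise, so correlations are positive yet well below 1.00.
genuine = np.column_stack(
    [base + rng.integers(0, 3, size=443) for _ in range(10)]
)

def perfect_pairs(data, tol=1e-9):
    """Count question pairs whose Pearson correlation is 1 (up to rounding)."""
    corr = np.corrcoef(data, rowvar=False)
    n = corr.shape[0]
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if abs(corr[i, j] - 1.0) < tol)

print(perfect_pairs(fabricated))  # 45 — every one of the 10*9/2 pairs in lock step
print(perfect_pairs(genuine))     # 0 — real-looking noise breaks the lock step
```

A fabricator who fills in answers by sliding one mental dial up and down produces exactly the left-hand pattern; genuine respondents, however correlated their opinions, essentially never do.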
Your eye may have been drawn toward the very high correlation of 0.97 for the supervisors I haven’t flagged as suspicious. But this is for garbage collection versus sewage disposal. In fact, it makes sense that these two would be closely linked and the much weaker connection for the likely cheaters strikes me as further evidence that they made up their data.
Quoting again from the Exhaustive Review:
Examining expected correlations is a reasonable way to search for evidence of data fabrication; it’s very difficult for a fabricator to anticipate relationships among variables and fake data accordingly. We find, however, that the lack of correlations of the type that Koczela and Spagat document appears again to be an artifact of their groupings of supervisors. (We also note that we have examined many more correlations, 96 in total, than Koczela and Spagat report.)
I have to agree with the exhaustive reviewers here. Looking at correlations is quite a good way to uncover fabrication. Indeed, the above table is strong evidence of fabrication. However, I’m baffled by how results like these are supposed to be “artifacts” of groupings. I honestly don’t know what to make of this comment.
Of course, the above table contains only 45 correlations. Even adding the five reported in the original paper, which the exhaustive reviewers did not attempt to explain, I’m still 46 shy of the exhaustive reviewers’ 96. I guess I’ll have to work harder.
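The arithmetic behind those counts is just combinations: a battery of 10 questions yields 10-choose-2 distinct pairs.

```python
from math import comb

pairs_in_table = comb(10, 2)  # distinct question pairs in the battery
from_original_paper = 5       # correlations reported in the original paper
reviewers_total = 96          # correlations the exhaustive reviewers examined

print(pairs_in_table)                                   # 45
print(reviewers_total - (pairs_in_table + from_original_paper))  # 46 still to go
```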
Remember that all the analysis in this post is of a new survey not covered in my original paper. I was able to use the list of suspicious supervisors taken from the earlier paper to immediately find big correlation anomalies, among others, in a new dataset. In other words, this is an out-of-sample success, and an easy one at that.
Finally, here is the list of questions in the battery:
Q3a-The following services for your neighborhood over the past month have been…Water Supply?
Q3b-The following services for your neighborhood over the past month have been…Electric Supply?
Q3c-The following services for your neighborhood over the past month have been…Telephone Service (land line)?
Q3d-The following services for your neighborhood over the past month have been…Telephone Service (Mobile)?
Q3e-The following services for your neighborhood over the past month have been…Garbage Collection?
Q3f-The following services for your neighborhood over the past month have been…Sewage Disposal?
Q3g-The following services for your neighborhood over the past month have been…Conditions of roads?
Q3h-The following services for your neighborhood over the past month have been…Traffic Management?
Q3i-The following services for your neighborhood over the past month have been…Police Presence?
Q3j-The following services for your neighborhood over the past month have been…Army Presence?