UNICEF Gives us lots of Great Data… and maybe a bit too much Expert Judgement

All readers of this blog should know this UNICEF web site.  Childmortality.org lets you quickly call up a mass of child mortality data for most countries in the world.

Sometimes you learn that the state of our knowledge is a mess (for example, check out Angola) but UNICEF always tries to impose some order by drawing dark blue lines through the middle of the data points.

I just wish that UNICEF was more transparent and rule-oriented about how they create these lines.  They seem to be products of both sophisticated modelling and bargaining among experts.

There is, of course, a certain appeal to the notion that UNICEF people don’t just blindly adopt the curves that their computers spit out at them.  Rather, the masters bring their voluminous knowledge and uncanny intuitions to the table.

However, there is a danger that the practice of tweaking one’s estimates with expert judgements can open the door to all sorts of prejudices and biases.  (This joint interview with Daniel Kahneman and Gary Klein is relevant to this question.)

Take a look at UNICEF’s Democratic Republic of Congo graph.

[Figure: UNICEF’s child mortality graph for the Democratic Republic of Congo]

To me, the flat blue region seems strangely unmoored from the data.  So I took a crude pass at the underlying data between 1999 and 2011.  (When there were multiple observations in a single year I just averaged them and I rounded month-year observations to the nearest year.)

Here are the data together with a straight-line fit to the points:

[Figure: DRC child mortality data, 1999 to 2011, with a straight-line fit]

UNICEF’s flat region seems nowhere to be found.
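For anyone who wants to try the same kind of crude pass, here is a minimal Python sketch of the procedure described above: round each observation to the nearest year, average when a year has more than one observation, then fit a straight line by ordinary least squares. The function names are mine and the sample numbers are invented purely to show the mechanics. They are not the DRC data, and I am assuming observation dates arrive as decimal years.

```python
import math
from collections import defaultdict


def collapse_to_years(observations):
    """Round each (decimal_year, rate) pair to the nearest year,
    then average all rates that land in the same year."""
    by_year = defaultdict(list)
    for when, rate in observations:
        year = math.floor(when + 0.5)  # round half up, e.g. 2001.7 -> 2002
        by_year[year].append(rate)
    return sorted((year, sum(rates) / len(rates)) for year, rates in by_year.items())


def fit_line(points):
    """Ordinary least squares fit of y on x; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented numbers for illustration only; NOT the childmortality.org data.
sample = [(2000.0, 200.0), (2000.2, 190.0), (2001.7, 180.0), (2004.0, 160.0)]
yearly = collapse_to_years(sample)
slope, intercept = fit_line(yearly)
```

A negative slope means a falling trend over the period. With real data you would read the observations off childmortality.org yourself and feed them in as `(decimal_year, rate)` pairs.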

OK, this wasn’t a randomly chosen example.  A little bird tells me that the flat part of the DRC graph will disappear for the next update of childmortality.org.

But why did UNICEF fudge the trend in the first place?  I think I may know the answer.  (Scepticism alert: I will now speculate about motives.  This is always hazardous for those of us who can’t climb into other people’s minds.)

There is an NGO called the International Rescue Committee that stridently insisted for years that there was a spike in child mortality rates in the DRC that started right around the year 2000.  This claimed spike was said to be so massive as to make the DRC conflict the deadliest war since World War II.  Back in 2010 the Human Security Report showed that the IRC was wrong about this.  Could it be that some of UNICEF’s experts have used their influence to mitigate the embarrassment of their friends at the IRC for being so spectacularly wrong about the DRC?

Of course, now that you know about childmortality.org you can see for yourself (without turning to the Human Security Report) that the DRC did not suffer a huge and sustained spike in child mortality in the 2000s.  Still, the Human Security Report deserves kudos for identifying this issue before the pile-up of data that proved its point.

To be clear, child death rates are still very high in the DRC.  Moreover, many people have been violently killed, raped and forced to flee during the course of war.  It has been an awful, nasty war.   Unfortunately, other wars have been awful, nasty and bigger.

Civilians versus Combatant Watch: Ewen MacAskill edition

Here is a decent article by Ewen MacAskill reporting on a plan by Jeremy Corbyn to apologize to the Iraqi and British people over the Iraq war if he becomes Labour leader next month.

Great.

Unfortunately, the article also provides a perfect example of the shoddy practice, discussed just a few weeks ago on the blog, of blurring the distinction between  combatants and civilians.

“The Iraq Body Count project puts the civilian death toll at 219,000 since the invasion that toppled Saddam Hussein, though others put it much higher. The number of British personnel killed in the war was 179 and the US 4,425.”  (Note: the quote is from MacAskill, not Corbyn.)

Dear Readers, please google “Iraq Body Count” and look in the upper left-hand corner.  You will find this:

Documented civilian deaths from violence

142,939 – 162,177

Total violent deaths including combatants

219,000

In short, MacAskill presents the IBC number for civilians plus combatants as a civilians-only number.

As a side point, notice that IBC’s civilian range is for documented civilian deaths so the true number is surely higher, as the above quote implies.  Still, there is no actual measurement of violent deaths of civilians only that comes out higher than the IBC number.  The higher estimates are always of civilians plus combatants.

Citation Distortion: Part I

Apologies for the radio silence.  I went on holiday (for just one week) and then, somehow, have been desperately playing catch up ever since.

This 2009 paper by Steven Greenberg entitled “How citation distortions create unfounded authority….” strikes me as remarkably useful and important.

Suppose I write a peer-reviewed journal article  that includes a claim that “Campbell’s soup prevents breast cancer.”  I immediately drop a footnote citing seven journal articles.

Many readers will probably believe that the cited articles support the breast-cancer claim.   Sadly, nothing could be further from the truth.  In fact, it can easily turn out that none of the cited papers offer any support for my claim and some may even provide contrary evidence.

Faking people out with a blizzard of footnotes is a surprisingly effective strategy.  First of all you can intimidate many readers with your apparent erudition.  Plus people will ask themselves whether they want to invest precious time tracking down a load of footnotes.  They may figure that one or two of the citations could turn out to be flawed but is it really possible that all seven are not as advertised?

Yes, it is possible.

Have a look at pages 36-38 of this paper of mine.  This is just a sliver of a critique of the infamous Burnham et al. (2006) paper that dramatically overestimated the number of violent deaths in the Iraq conflict.  I will definitely return to this paper in future posts but for now I just want to note that there could hardly be a better example of a claim supposedly backed up by a lot of sources that don’t actually check out.

Greenberg’s study covers 242 papers with 675 citations on something incomprehensible (to me) having to do with proteins and Alzheimer’s disease.  Luckily, the only thing that matters for us is that there is a big literature pitting a  side A (my term) against a side B.

Please click on this nice summary picture of the Greenberg analysis:

[Figure: summary of Greenberg’s citation analysis]

The citations of side-A partisans have a pronounced tendency to back up only side A.  Greenberg calls this “citation bias”.

OK, I know that some of you will never recover from the shock of discovering that people prefer to cite evidence that backs up their own beliefs rather than evidence that calls their beliefs into question.  I apologize for doing this to you.

[Image: Susan Hill quotation on innocence]

More interesting is what Greenberg calls “citation diversion”.  This means taking a paper that supports side B but citing it as supporting side A.  Greenberg shows that three citation diversions mushroom into a whopping 7,848 false-claim chains.

Greenberg also introduces a third category, “invention”, to cover cases such as when a cited paper says nothing about the claim it supposedly backs or when a  citation elevates a mere hypothesis in the cited paper into a fact.

Once a diversion, invention or even an honest mistake is introduced into the literature it is readily perpetuated in subsequent publications.  Researchers may not bother to trace a claim back to its original sources, especially if these lazy souls  have a stake in the claim being true.
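To get a feel for how a handful of bad citations can mushroom into thousands of chains, here is a toy Python sketch. To be clear, this is my own simplified illustration, not Greenberg’s method or his definition of a chain: I just count every distinct path of citations that ends in a diverted citation, in a tiny made-up network with hypothetical paper names.

```python
from functools import lru_cache


def count_chains(cites, diverted):
    """Count citation paths that end in a diverted citation.

    cites:    dict mapping each paper to the list of papers it cites
              (assumed to contain no cycles).
    diverted: list of (citing_paper, cited_paper) pairs flagged as diversions.
    """
    # Invert the graph: for each paper, which later papers cite it?
    cited_by = {}
    for paper, refs in cites.items():
        for ref in refs:
            cited_by.setdefault(ref, []).append(paper)

    @lru_cache(maxsize=None)
    def chains_ending_at(paper):
        # The paper on its own counts as one chain; every chain that ends
        # at a paper citing it extends to one more chain ending here.
        return 1 + sum(chains_ending_at(p) for p in cited_by.get(paper, []))

    return sum(chains_ending_at(citing) for citing, _cited in diverted)


# Hypothetical network: D miscites B; E and F cite D; G cites both E and F.
toy = {"D": ["B"], "E": ["D"], "F": ["D"], "G": ["E", "F"]}
total = count_chains(toy, [("D", "B")])  # 5 chains from a single diversion
```

Even in this four-paper toy, one diverted citation yields five chains (D alone, E via D, F via D, and G via either E or F), because every new paper that cites an earlier carrier multiplies the routes back to the original distortion.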

In follow-up posts I’ll give you some nice examples from the conflict literature.

P.S. – While I was writing the present post this article appeared totally backing me up.  Take it from me.  You don’t need to bother reading it.