The Statistical Estimates of Peru’s Truth and Reconciliation Commission are Really Bad: Part 2

Back to blogging after diverting my energy in recent weeks to putting out a few fires.

I’ll assume here that you’ve read the first post of this new series.  So you know that overfitting is a terminal problem for the Shining Path (SP) estimates published in the statistical report of Peru’s Truth and Reconciliation Commission (TRC).  According to the TRC’s own stated standard for overfitting, the models in no fewer than 47 out of the TRC’s 58 geographical strata are inadmissible.  And many of the TRC’s models are not just somewhat out of bounds; they’re egregiously overfit.

What about underfitting?

That is, does the TRC extrapolate any SP estimates from models with inadmissibly bad fits, even according to the TRC’s own stated fitting standards? Yes they do.  The impact of this violation is small, however, since the TRC extrapolates from a really badly fitting model only in stratum 32.  Yet stratum 32 is quite an interesting case.  So I devote the rest of this post to it.

Recall that Silvio Rendon and the TRC use different methods to estimate SP-caused deaths.  The details need not concern us here.  The salient point for this post is that Daniel Manrique-Vallier and Patrick Ball (MVB), two authors of the original TRC report, assert that Rendon’s method is biased toward underestimation.

MVB do head-to-head comparisons with Rendon’s SP estimates in 9 strata, and MVB’s published numbers place Rendon’s estimates below the TRC’s in 8 out of these 9 strata.  These results are broadly consistent with the idea that Rendon’s method is biased downward, although they are equally consistent with the idea that the TRC’s method is biased upward.

It turns out, however, that Rendon’s SP estimates are higher than the TRC’s in stratum 32.  Here are the numbers:

TRC – 328

Rendon’s preliminary estimate – 751  (before multiple imputation)

Rendon’s main estimate – 877  (after multiple imputation)

Unfortunately, MVB don’t say that stratum 32 goes against the grain of their argument.  To the contrary, Figure 4 in their supplementary materials wrongly claims a tie, placing the estimates of both Rendon and the TRC just below 600.

Figure 4 also asserts that any estimate below around 330 (eyeballing the graph) is impossible.  Yet their own (dubious) methodology for defining “impossibility regions” would place this boundary at 170.

I’ve argued before that all of MVB’s figures are misleading and should be corrected.  This is because they airbrush statistical uncertainty away and assume perfect accuracy, both in the underlying data and in the matching of deaths across sources.  But Figure 4 reaches a new level of wrong.

My first series argued that MVB’s defense of the TRC work is weak.  The discovery of mistakes can only diminish further its persuasiveness.

Finally, please keep your eye on the ball, which is the TRC report itself.  Here are the main take-home points so far in this series.

  1. The overfitting problem – this is sufficient to dismiss the whole SP portion of the TRC report.
  2. The underfitting problem – this makes the TRC’s problems a little bit worse.

Of course, MVB should correct the mistakes in their stratum 32 figure.  Beyond that I wonder whether there are other mistakes out there waiting to be discovered.

 


The Statistical Estimates of Peru’s Truth and Reconciliation Commission are Really Bad: Part 1

Silvio Rendon’s powerful rejoinder on the statistical work of Peru’s Truth and Reconciliation Commission (TRC) is finally out.

This is a big development so I’m starting a brand new series to cover it.  Of course, this new series is related to the earlier one.   But I’ll strive to make the present series self-contained so you can start here if you want to.  

The sequence of events leading up to the present moment goes as follows.

I.  The TRC of Peru publishes a statistical report that makes two surprising claims.

  1.  Nearly 3 times as many Peruvians were killed in the war (1980 – 2000) as the combined efforts of human rights NGOs, Peru’s Ombudsman office and the TRC were able to document on a case-by-case basis.
  2.  The Shining Path (SP) guerrillas killed more people than the State did – reversing the pattern of the combined list of documented deaths, which formed the basis for the TRC’s statistical work.

II.  Silvio Rendon publishes a critique of the TRC’s statistical work.  He also proposes new estimates which, compared to the TRC’s estimates, increase the number of deaths attributed to the State, decrease the numbers of deaths attributed to the SP and “Other” groups and decrease the total number of deaths for the war as a whole.  His estimates are inconsistent with the TRC’s surprising conclusions.

III.  Daniel Manrique-Vallier and Patrick Ball (MVB), two authors of the TRC’s original statistical report, reply with a critique of Rendon’s estimates, indirectly defending their own SP estimates.

My earlier series covers the above three developments.

IV.  Now we have Rendon’s rejoinder which mostly attacks the original work of the TRC but also defends his own estimates from MVB’s critique.

I make one caveat before proceeding.  Rendon’s work is replicable but I have not tried to replicate it.  I’ll just assume here that Rendon’s claims are correct.  This is a reasonable thing to do since no one has discovered a substantive error in Rendon’s work in the debate so far.

Rendon’s rejoinder finds grave and terminal deficiencies in the statistical report of Peru’s TRC.  There are several issues in play but today I’ll cover just the issue of overfitting, which turns out to be quite a big problem.

There is a footnote in the TRC statistical report itself that provides a good explanation for the dangers of overfitting.  The quoted text can supplement or replace the above overfitting link or, if you prefer, you can try this very short alternative explanation:

Overfitting occurs when the model fits the data too closely and therefore models only these data and none other. As the number of parameters used to fit the model approaches the number of cells in the table, all of the available information has been used for the model fitting, and none remains for the estimation. The goal is to find a model that fits reasonably well, but not so well that the same model would fail to fit different data describing the same phenomenon.

The TRC statistical report also proposes a policy to avoid overfitting, although they ignore it for their SP estimates: reject models with goodness-of-fit p values that exceed 0.5.  (Possible p values range from 0, i.e., no fit whatsoever, up to 1, i.e., a perfect fit.)
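To make this policy concrete, here is a minimal sketch of how such a check could work.  The deviance and degrees-of-freedom numbers are hypothetical placeholders (not from any actual TRC stratum), and I’m assuming the usual chi-squared reference distribution for a model’s residual deviance; this is not the TRC’s code.

```python
# Hypothetical sketch: flag strata whose goodness-of-fit p value exceeds the
# TRC's own stated 0.5 threshold. Deviance/dof values below are made up.
from scipy.stats import chi2

def gof_p_value(deviance, dof):
    """Goodness-of-fit p value from a model's residual deviance.
    Deviance near 0 relative to dof gives a p value near 1 (a suspiciously perfect fit)."""
    return chi2.sf(deviance, dof)

def admissible(p_value, threshold=0.5):
    """The TRC's stated rule: models with p values above the threshold are overfit."""
    return p_value <= threshold

# (stratum id, residual deviance, degrees of freedom) -- hypothetical numbers
strata = [(1, 3.2, 2), (2, 0.0, 1), (3, 0.9, 3)]
for sid, dev, dof in strata:
    p = gof_p_value(dev, dof)
    print(f"stratum {sid}: p = {p:.3f}, admissible = {admissible(p)}")
```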

Given this policy it’s shocking to discover that the TRC bases its SP estimates on perfectly fitting models in 14 out of its 58 geographical strata.

In fact, I learned through correspondence with Silvio Rendon that these 14 cases of egregious overfitting form just the tip of an overfitting iceberg.  In a further 12 strata the TRC models are so close to perfect fits that their goodness-of-fit p values round up to 1.  Moreover, p values are between 0.7 and 1 for an additional 13 strata and between 0.5 and 0.7 for a further 8 strata.  In other words, according to their own stated standards the TRC estimated SP-caused deaths off of overfit models in 47 out of their 58 strata.  Most of these models are badly overfit.
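(Adding up the bands: 14 + 12 + 13 + 8 = 47 of the 58 strata.)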

To summarize, based on overfitting issues alone we should bin most of the TRC’s SP estimates.


And overfitting is just one of the problems that plagues these SP estimates.  Stay tuned for more in part 2 of this series.

PS – You may want to check out a parallel series on the blog about the perils of matching deaths and events across lists.  It focuses on Iraq data but co-author Josh Dougherty and I discuss connections that are relevant for the Peru discussion.


Important New Violent Death Estimates for the War in Peru with Implications Beyond just Peru: Part 6

This is the latest installment in a series that considers the statistical report done for the Peruvian Truth and Reconciliation Commission (TRC), Silvio Rendon’s critique of this statistical report and a reply to Rendon from Daniel Manrique Vallier and Patrick Ball (MVB) who worked on the TRC statistical report.  The present post continues to discuss the MVB reply.

(Note that I may not resume this series until Silvio Rendon’s rejoinder is published.  Meanwhile, I’m also working with Josh Dougherty of Iraq Body Count on an offshoot post that will cover the practice and pitfalls of matching deaths across multiple lists.)

Today I’ll comment on nine figures from the MVB reply: figure 1 in the main body of the paper and figures 2-9 in the appendix.

I won’t produce any of the figures here because they are misleading and a picture is worth a thousand words.  The main features I object to are that the figures substitute lower (preliminary) stratum-level estimates for Rendon’s main estimates and suppress the uncertainty surrounding these estimates.  Moreover, MVB portray some of these lowered point estimates as falling within an “impossibility region,” a characterization which further assumes that MVB’s matching of deaths across sources was perfectly executed on fully accurate data.

Nevertheless, the figures do convey some interesting simulation-based information that addresses the question of when a direct estimation approach outperforms MVB’s indirect one and vice versa.  Each of the nine figures uses data from a stratum for which one can directly estimate Shining Path (SP) deaths.  (There are nine such strata before multiple imputation and two more, not covered by the figures, after multiple imputation.)

The X axis in each picture represents all the possible true values for the number of SP-caused deaths (with the true values indexed by N).  MVB perform simulations that estimate the number of SP-caused deaths many times for each stratum and for each N using both direct capture-recapture and MVB’s indirect capture-recapture methodology.  MVB then calculate the deviation of each estimate from the underlying true value, square these deviations (so that negative deviations do not cancel out positive ones) and take the mean of these squared deviations across all simulation runs for each value of N.  Finally, they graph these “mean-squared errors” for each method and each N in all nine strata.
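A minimal sketch of this kind of exercise might look like the following.  The two estimator functions are crude, hypothetical stand-ins, not MVB’s actual direct and indirect capture-recapture procedures.

```python
# Hypothetical sketch of the mean-squared-error comparison described above.
# The two "estimators" are crude stand-ins, not MVB's actual procedures.
import numpy as np

rng = np.random.default_rng(0)

def direct_estimator(true_n):
    # placeholder: noisy and slightly low on average
    return rng.normal(loc=0.95 * true_n, scale=0.10 * true_n)

def indirect_estimator(true_n):
    # placeholder: noisier and slightly high on average
    return rng.normal(loc=1.05 * true_n, scale=0.15 * true_n)

def mse(estimator, true_n, runs=5000):
    """Average squared deviation of the estimator from the assumed true value."""
    draws = np.array([estimator(true_n) for _ in range(runs)])
    return np.mean((draws - true_n) ** 2)

# grid of assumed true SP-caused death counts (N), as on the X axis of MVB's figures
for n in [200, 400, 600, 800]:
    print(n, round(mse(direct_estimator, n)), round(mse(indirect_estimator, n)))
```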

For eight out of the nine strata the direct method outperforms (i.e., has lower mean-squared errors than) the indirect method for values of N below some critical value, and the indirect method outperforms the direct one above this same critical value.  (For one stratum the reverse is true, but there is never a big difference between the two methods in this stratum so this doesn’t seem to matter much.)  For three strata the critical value for which the best performing method switches from direct to indirect is inside of MVB’s “impossibility region”.

In eight out of the nine strata the indirect method outperforms the direct method when the true number of people killed by the SP is set equal to the estimate that the TRC actually made for that stratum (using the indirect method).  Essentially, this rather unsurprising result says that the indirect method performs well in simulations of cases for which the TRC’s indirect estimate  delivered a correct result.  And the indirect method also performs well when the TRC’s estimate is not spot on but still reasonably close to being correct.

The direct method tends to outperform the indirect one in simulations that start from the assumption that the direct estimate is correct.  Nevertheless,  in three out of the nine strata the indirect method actually wins this contest.

Overall, these simulation results tend to favor the indirect method over the direct one, especially when the true numbers are assumed to be rather high.

That said,  the direct method in the simulations does not match Rendon’s main method because, again, MVB omit the multiple imputation step of Rendon’s procedures.  Incorporating multiple imputation should shift the balance back towards Rendon.  And, again, I would like to see a similar exercise performed on Rendon’s alternative approach that covers the whole country with ten strata.

Here’s one last point before I sign off.  As of now, the MVB reply is still just a working paper, not yet published in Research and Politics.    The main advantage of posting a working paper before publication is that you can respond to feedback.  Thus, it would be great and appropriate for MVB to take advantage of the remaining time window by purging the misleading material about impossible point estimates without uncertainty intervals from the published version of their paper.  (See post 4 and post 5 of this series in addition to the present one for further details.)  This move would help lead us toward more fruitful future discussions.

Important New Violent Death Estimates for the War in Peru with Implications Beyond just Peru: Part 5

I’ll start this post by reacting to some interesting comments to part 4 of this series which was, you may be surprised to learn,  preceded by part 1, part 2 and part 3.  I’ll assume that readers have some familiarity with these posts but I’ll also try to go slowly and remind readers of things we’ve discussed before.

Recall that there is a statistical report done for the Peruvian Truth and Reconciliation Commission (TRC), Silvio Rendon’s critique of this statistical report and a reply to Rendon from Daniel Manrique Vallier and Patrick Ball (MVB) who worked on the TRC statistical report.

Let’s focus first on the data.

The TRC statistical report and Rendon’s critique are both based on what I’ll call “the original data,” which consists of 3 lists (after some consolidation) containing a total of 25,000 unique (it is claimed) deaths, many appearing on multiple lists.

There are several issues concerning the original data:

First, only summary information from the original data is in the public domain.  The following table from the TRC statistical report shows the form of the publicly available original data:

We can see, for example, that there are 627 people recorded nationwide as killed by the State (“EST”) who appear on the list of the CVR (the TRC itself) and the DP (Defensoria del Pueblo) but not on the ODH (NGO’s) list.

There are 59 such tables, one for each geographical stratum.  Each one looks  like the above table but, of course, with smaller numbers.  Both the TRC and Rendon base their statistical work on these tables.
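To fix ideas, here is a minimal sketch of how one such table can be represented and fed into a basic direct capture-recapture estimate.  The counts are invented (this is not a real TRC stratum), and the main-effects log-linear model shown is just the simplest textbook variant, not necessarily the model the TRC or Rendon actually used.

```python
# Hypothetical three-list stratum table and a basic log-linear (independence)
# capture-recapture estimate of the undocumented deaths. All counts are made up.
import numpy as np
import statsmodels.api as sm

# inclusion pattern (CVR, DP, ODH) -> documented deaths with that pattern
cells = {
    (1, 1, 1): 40,
    (1, 1, 0): 60,   # on the CVR and DP lists but not the ODH list
    (1, 0, 1): 35,
    (0, 1, 1): 25,
    (1, 0, 0): 150,
    (0, 1, 0): 120,
    (0, 0, 1): 90,
}

patterns = np.array(list(cells.keys()))
counts = np.array(list(cells.values()))
X = sm.add_constant(patterns)  # intercept plus one indicator per list

# Poisson log-linear model with main effects only (i.e., lists treated as independent)
fit = sm.GLM(counts, X, family=sm.families.Poisson()).fit()
n_unobserved = np.exp(fit.params[0])  # predicted count for the unobserved (0, 0, 0) cell
print("estimated undocumented deaths:", round(n_unobserved))
print("estimated total deaths:", counts.sum() + round(n_unobserved))
```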

The problem is that these 59 tables alone do not allow us to examine the underlying matching of deaths across lists that they summarize.  Matching is a non-trivial step in the research that involves a lot of judgment.  I will examine the matching issue in an upcoming post.  Suffice to say here that until this step is opened up we are not doing open science.

To be fair, it appears possible for at least some researchers to obtain the detailed data from the Peruvian government and perform their own matching.  According to Patrick Ball:

“People with access to the detailed TRC data” is not an intrinsic category of people: it’s just people who have asked nicely and persisted (sometimes, like us, over several years), until they got access to the data. It seems to me that with sensitive data, obtaining the relevant information is incumbent upon the researcher: Rendon could have inquired of the Peruvian Ombudsman office to get the TRC data. It’s not secret, it just requires a bit of work to obtain, and he chose not to do so.

I don’t like this.  The data should simply be available for use.  Patrick may be right that, effectively, it’s open to all nice and persistent people.  But the data should be available to mean and non-persistent people as well.

Let’s move on to a few observations on the MIMDES data, the detailed version of which is in the public domain.  (Apparently it’s not online right now but has been in the past, and hard copies can be obtained.)

First, the public availability of the MIMDES data undermines excuses for forcing researchers to jump through hoops for the TRC data.  They are both detailed lists of people killed in the war.  These lists are both held by the Peruvian government.  Why is it OK to circulate one list while requiring researchers to be nice and persistent for the other?

Second, I know nothing about the data collection methodology for the MIMDES data.  OK, perhaps I should obtain and study the MIMDES reports.  But the MVB reply paper introduces the MIMDES data into this whole discussion so they should describe the MIMDES data collection methodology in their paper.  (They also should have described the data collection methodologies for the lists used in the TRC’s statistical report.)  But the MIMDES methodology seems particularly important since Patrick Ball, in his comments on this blog, urges us to treat the MIMDES sample as more or less representative of all deaths in the war.  I would need to know something about how MIMDES performed its work before entertaining such a notion.

MVB have matched the MIMDES deaths against the TRC’s deaths and the resulting figures are central to their reply to Rendon’s critique.  For three reasons, however, I recommend that we take these merged TRC-MIMDES figures with a grain of salt, at least for now.  First, MVB don’t explain how they do the matching.  Second, they say their work is unfinished.  Third, it is difficult at present for anyone to match independently since the TRC data are not really open.  (Remember that you need the detailed TRC data in order to match it against the MIMDES data.)

That said, for the rest of the post I’ll take the numbers from MVB’s TRC-MIMDES merge as given just as I’ve done in my earlier posts.

Patrick and Daniel especially emphasize one point in their separate comments on post number 4 (in which they focus exclusively on Rendon’s Shining Path (SP) estimates).  Recall that Rendon’s main estimate starts with estimates from just the geographical strata that allow for direct estimation (after multiple imputation) and then uses spatial extrapolation (kriging) to extend these estimates to the whole of Peru.  But, MVB argue, the estimates in the selected strata are biased downwards because the fact that there’s rich enough data to do direct estimation already suggests that there are relatively few undocumented deaths left to discover in these strata.  Conversely, MVB suggest, the strata where data is too sparse for direct estimation probably contain relatively many undocumented deaths.

This is a creative idea with some potential but I think that, if it exists, its effect is probably small.  One of Rendon’s alternative estimates cuts Peru up into just 10 regional strata which cover the entire country rather than the 59 more localized strata in MVB’s stratification scheme.  This 10-stratum estimate is not subject to MVB’s selection bias argument.  The SP estimate in this case is around 1,000 deaths more than Rendon’s main estimate (which requires strata selection and spatial extrapolation).  So perhaps MVB have identified a real bias although, if so, it seems to be a small one.  There are, of course, multiple changes when we move from Rendon’s main estimate to the 10-stratum one.  But MVB need their suggested bias to be huge in order to produce the 10,000-plus additional deaths required to make their TRC estimate look accurate.  The above comparison doesn’t suggest a bias effect of this order of magnitude.

The frailty of MVB’s stratum-bias critique is exposed by the games they play to portray Rendon’s direct stratum estimates for the SP as systematically lower than the SP numbers in their merged TRC-MIMDES dataset.

They begin by deleting the multiple imputation step of Rendon’s procedures.  Daniel Manrique-Vallier explains:

Thus, showing that the application of capture-recapture to those strata leads to contradictions, automatically invalidates all the rest of the analysis. This is because those more complex estimates depend on (and amplify) whatever happens with the original 9; you can’t extrapolate from strata that are themselves inadequate. That’s what we have shown. Specifically, we have shown that the application of capture-recapture to those 9 strata (Rendon’s necessary condition) results in estimates smaller than observed counts (contradiction). This means that the basic premise, that you can use those strata as the basis for a full blown estimation, is faulty (modus tollens). Anything that depends on this, i.e. all the rest of the conclusions, is thus similarly faulty (contradiction principle).

This makes no sense.  Applying multiple imputation before capture-recapture increases the estimate in every stratum.  These higher estimates then feed through the spatial extrapolation to increase the national estimates.  Deleting the multiple imputation step decreases the estimates at both the stratum and national levels.  Manrique-Vallier argues, in effect, that doing something to increase Rendon’s estimates can only compound the problem of his estimates being too low.  This is like saying that drinking a lot of alcohol makes it dangerous to drive so (modus tollens) sobering up can only make it more dangerous to drive.

Next MVB try to dismiss all of Rendon’s work based on their claim that some of Rendon’s point estimates (which they have lowered) are below their merged TRC-MIMDES numbers.  Simultaneously, MVB apply a far more lenient standard to evaluate Patrick Ball’s Kosovo work (see MVB’s comments on post 4).  For Kosovo, they argue, it’s not a problem for most estimates to be below documented numbers as long as the tops of Ball’s uncertainty intervals exceed these numbers.  Moreover, it’s even OK for this criterion to fail sometimes as long as Ball’s broad patterns are correct.  I actually agree with these standards but consistency requires applying them to Rendon’s Peru work as well.

That said, Patrick Ball’s last comment makes a good point about the serious challenges to data collection in Peru.  By contrast, it’s easier to collect war-death data in Kosovo and lots of resources were devoted to doing just this.  So the true numbers in Peru might be substantially larger than MVB’s TRC-MIMDES ones.  I agree that this is possible but I would want to know a lot more about the various data collection projects in Peru before taking a strong stand on this point.

Finally, I return to Rendon’s 10-stratum estimate, which appears immune to all the criticism contained in MVB’s reply.  The central estimate is about 2,000 deaths above MVB’s TRC-MIMDES national count for SP deaths, leaving considerable room to accommodate the discovery of more deaths, especially in light of the uncertainty interval.  That said, it would be interesting to see stratum by stratum comparisons with TRC-MIMDES to see whether there are any substantial discrepancies.

To summarize, perhaps Rendon’s SP estimates are somewhat low.  But MVB’s reply does little to undermine Rendon’s critique beyond this minor observation.


Important New Violent Death Estimates for the War in Peru with Implications Beyond just Peru: Part 4

This is the fourth post in a series on the statistical report of the Peruvian Truth and Reconciliation Commission (TRC), the critique of that report published by Silvio Rendon, the reply to Rendon from two authors of the statistical report and, eventually, Rendon’s rejoinder which has not yet been published.  My earlier posts are here, here and here.

Note – I just noticed that in my last post I reversed the order of the authors on the reply paper.  I will fix this.  From now on the order will be  Daniel Manrique-Vallier and Patrick Ball (MVB).

In their abstract MVB write:

We first show that his most important result, an alternative estimate of the mortality due to the Maoist guerrillas of Shining Path is lower than existing observed data and is therefore impossible.

MVB elaborate in the introduction that:

There are three bases for our rejection of Rendon’s methods and findings: first, his estimates are inconsistent with observed data. By combining the data used by the TRC with data published by the Peruvian government between 2004 and 2006 (MIMDES 2004, 2006), we see that Rendon’s estimates for SLU [the Spanish acronym for Shining Path] are, in most strata and in the aggregate lower than the number of observed SLU victims-without considering victims who continue to be undocumented.

In short, MVB integrate new data into the lists of documented deaths that they used in their original TRC report and that Rendon used in his critique.  MVB claim that these new integrated counts exceed Rendon’s aggregate Shining Path (SP) estimate as well as his estimates in most of the strata for which he did direct estimates.  There is some merit to this line of argument but, as presented, it is weak and even disingenuous.

Recall that Rendon’s main aggregate SP estimate is roughly 18,000 with a 95% uncertainty interval of about 15,500 to 21,000.  For convenience I repeat Rendon’s main table here:

MVB’s new number for documented SP-caused deaths, integrating the new data, is 17,687.  Here is MVB’s main table:

So Rendon’s main estimate is actually higher than 17,687, the new number of observed SP victims (according to MVB), not lower as MVB claim.  Moreover, the top of Rendon’s uncertainty interval is nearly 20% above 17,687, leaving much room for more SP-caused deaths to be discovered without really creating problems for Rendon’s estimate.
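(Concretely: the top of Rendon’s interval, about 21,000, divided by 17,687 is roughly 1.19, i.e., about 19% above the documented count.)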

At this point I’m sure you’re wondering what the heck is going on here.  The answer is that MVB’s table does cite an actual Rendon SP estimate, just not his main one.  Recall that Rendon’s main estimate is based on three methodological elements: multiple imputation to assign perpetrators when these are listed as “unknown” in the TRC databases, direct estimation in the strata that allow it after multiple imputation and kriging to extend the estimate to the remaining strata.  MVB cite an estimate that used just direct estimation and kriging, not the full estimate with multiple imputation prior to direct estimation and kriging.

It’s fine for MVB to cite the direct-kriging estimate and it’s interesting to compare this one with their new figure for documented deaths.  The problem is that MVB stop at just the direct-kriging estimate and then hinge their case against Rendon on the fact that this incomplete estimate is below a cut-off value.  But multiple imputation is part of Rendon’s methodology and the complete estimate, including this step, exceeds the cut-off value.

Analogously, it’s fine to point out that Watford needed a late goal to beat Leicester City in a game that might well have ended a draw.  But it’s not OK to just ignore that late goal and claim that the game was drawn.

Alert readers may remember that Rendon offered two other SP estimates that also incorporated multiple imputation.  First, there is a fixed effects SP estimate that comes out to around 17,500.  MVB could argue, in their terms, that this estimate is “impossible” although it’s only below their new documented number by around 150 deaths.  Moreover, this estimate is surrounded by an uncertainty interval of around plus or minus 4,000 deaths and it hardly seems appropriate to focus all attention on the central estimate.  Still, it’s true this estimate does include multiple imputation and is below the new documented SP count.

The other Rendon SP estimate comes from dividing Peru into only ten strata so that direct capture-recapture estimates can be performed for each perpetrator in each stratum.  This SP estimate is around 19,500 plus or minus 6,000, well above the new documented figure even before we factor in the uncertainty interval.

In short, there is no basis to dismiss Rendon’s methods based on the idea that they lead to impossible aggregate findings for the SP.

What about stratum by stratum estimates?  MVB’s table (above) gives figures for one stratum that place Rendon’s central estimate nearly 500 deaths below MVB’s new figure, a deficit of more than 40%.  This is interesting and I would  like to know more about this stratum but this observation has limited importance.  First of all, it is only one stratum, just a small piece of Rendon’s aggregate estimate.  Second, this stratum appears to be the most favorable one for MVB, although they characterize it as merely illustrative.  Third, MVB do not incorporate multiple imputation into the numbers they place in the table although doing so would increase the Rendon numbers. Still, I agree that Rendon’s estimate is probably below the true number for this stratum, although I am not shocked to see a statistical estimate that turns out to be below a true value.

MVB further claim in the second quote above that most of Rendon’s direct stratum estimates for SP are below their new figures for documented deaths.  Maybe this is true, but I will reserve judgment on this claim until I see stratum by stratum comparisons that incorporate multiple imputation and uncertainty intervals.

Here is one final point for today.  A few years ago I worked with Patrick Ball on an evaluation of the database of Kosovo Memory Book (KMB).  I produced a report in which I argued that there is great consistency between KMB’s list of documented deaths for the war in Kosovo and two separate statistical estimates of the same thing (covering somewhat different time periods).  One set of the statistical estimates consists of capture-recapture estimates done by Patrick Ball and co-authors.  In many strata these estimates are below KMB’s numbers for documented deaths.  This happens in the West (see the table below) and for all time periods for which the black curve is above the blue one in figure 2 below.  I have never viewed these differences as a problem and, indeed, I consider this work to be quite a success for Ball, KMB and the method of capture-recapture.  Yet the logic of the MVB reply seems to require that we reject Ball’s methods and findings on Kosovo.  After all, some of his central estimates are impossible.

OK, I still haven’t covered the whole reply yet but that’s enough for today.


Important New Violent Death Estimates for the War in Peru with Implications Beyond just Peru: Part 3

This post continues a series that started here and continued here.  The earlier posts covered Silvio Rendon’s critique of the statistical report of Peru’s Truth and Reconciliation Commission (TRC).  Rendon argues that the two striking and central statistical claims made by the TRC are both wrong.  These claims are:

  1.  The true number of deaths in Peru’s war (1980 – 2000) was around 69,000 with a 95% uncertainty interval (UI) of 61,000 to 78,000, that is, nearly 3 times the 25,000  documented deaths on the TRC’s 3 lists (after some consolidation).
  2. The Shining Path (SP) killed many more people than the Peruvian State did, despite the fact that the reverse is true among the documented deaths on the TRC’s lists.

Let’s turn now to the reply to Rendon made by Daniel Manrique-Vallier and Patrick Ball (MVB), two of the authors of the original TRC work.  This post just covers things that MVB do not do in their reply.

First, MVB do not directly defend any of the TRC estimates.  Rather, they concede that the TRC work has problems but argue that it is still better than Rendon’s work:

We agree that there are aspects of the TRC’s approach that should be improved – we have been working on this for several years.  However, our response here focuses specifically on Rendon’s proposal: we show that his methods are substantially weaker than the TRC’s and that his results are unsound.

Second, MVB do not challenge any of Rendon’s estimates for the total number of people killed in the war, for the total number of people killed by the Peruvian State or for the total number of people killed by “Other” groups.  Instead, MVB dispute only some of Rendon’s estimates for the number of people killed by the SP.

Here is Rendon’s main table of estimates:

And here are the TRC’s estimates:

The TRC’s estimate for the total number of people killed in the war was a primary point of emphasis in the TRC’s statistical report (point 1 above).  So it is notable that MVB’s reply does not dispute Rendon’s estimate of roughly 48,000 total deaths (95% UI of 42,500 to 53,000), which is substantially below the TRC’s estimate (although still well above the 25,000 documented deaths on the TRC’s lists).  Nor do MVB challenge Rendon’s estimate for deaths caused by the Peruvian State, which is almost 40% higher than the TRC’s estimate.  Rendon’s estimate for deaths caused by Other groups, which is far lower than the TRC’s estimate, also goes unchallenged.  Indeed, the MVB reply doesn’t mention these estimates.

Third, MVB do not criticize Rendon’s use of multiple imputation to assign known perpetrators to the deaths that were attributed to unknown perpetrators by the TRC.  Recall that the TRC lumped deaths caused by unknown perpetrators together with deaths caused by Other (but known) perpetrators in their estimates.  This conflation of unknown with Other perpetrators largely explains the big discrepancies for State and Other killings visible in tables 11 and 3 (above).  Indeed, MVB’s reply concedes that Rendon’s multiple imputations improve on the TRC’s work:

The TRC’s study has several limitations.  In particular, estimates stratified by perpetrator require accounting for records with missing perpetrator labels, as Rendon does.

Fourth, MVB ignore Rendon’s main SP estimate (Table 11 above).  Thus, they don’t actually engage with any estimate in Rendon’s central table.  Instead, MVB confine their criticisms to an SP estimate, and its components, that Rendon provides earlier in his paper, before he uses multiple imputation to move deaths from the unknown perpetrator category into the known perpetrator categories.

Fifth, MVB do not contest Rendon’s claim of sample selection bias against killings caused by the State in the TRC estimates.  This bias was introduced through the exclusion of nearly 3,000  documented deaths that could not be geographically located down to the stratum level. This table summarizes the exclusions:

The disproportionate exclusion of killings attributed to the State must have contributed to the TRC’s reversal of primary responsibility for killings from the State to the SP.  Rendon attenuates this bias in separate estimates that use only 10 regional strata and that reach conclusions similar to those in table 11 (above).  MVB do not discuss these alternative estimates in their reply.

Sixth, neither the MVB reply nor the original TRC statistical report discusses how any of the data they use were actually gathered.  This apparent lack of interest in data generation methodologies was always a curious feature of the TRC’s statistical report, given its strong emphasis on the use of capture-recapture to correct biases in these data sources.  MVB’s reply focuses centrally on a new dataset that was collected after the TRC finished its work.  Yet MVB are silent on how the new data were collected.

To be clear, both the TRC report and the MVB reply do discuss what is in the datasets that they use.  But, so far as I can see, neither document discusses the methodologies that were used to collect the data in the first place.  (In fairness, Rendon’s paper does not discuss data collection methodologies either.)

OK, I think those are the main things that MVB do not do in their reply.  In the next post I’ll turn to what they do do.

Note – I interchanged the order of the authors of the reply paper in the first version of this post.  I’ve now fixed this mistake, changing “BMV” to “MVB”.

Important New Violent Death Estimates for the War in Peru with Implications Beyond just Peru: Part 2

This is a follow-up to last week’s post about new estimates, published by Silvio Rendon, of human losses in the war in Peru, 1980 to 2000.

Important context is that the Truth and Reconciliation Commission (TRC) of Peru published estimates that were quite surprising in at least two respects:

  1.  The TRC’s estimate for the total number of people killed in the war, 69,000 with a 95% uncertainty interval of 61,000 to 78,000, was far higher than the roughly 25,000 documented deaths spread across the three lists that the TRC worked with.
  2. The TRC estimated that the number of people killed by the left-wing Shining Path (SP) guerrillas was much higher than the number of people killed by the Peruvian State, reversing the perpetrator pattern for the roughly 25,000 documented deaths that formed the basis for the TRC’s estimates.

My previous post addressed only point 2, the TRC’s transfer of primary responsibility from the State to the SP, which the Rendon estimates transfer back to the State.

Let’s now consider point 1.

Rendon’s main estimate for total deaths is 48,000 with a 95% uncertainty interval of 43,000 to 53,000.  This is substantially below the TRC’s estimate but still well above the 25,000 documented deaths on the TRC’s lists.

The TRC’s estimate of 69,000 (61,000 – 78,000) total deaths in the war comes from an MSE analysis (also known as capture-recapture) of the three lists held by the TRC.  Crucially, the non-State component of this estimate, covering SP killings plus killings attributed to “other” groups, derives from an unusual indirect method that is meant to sidestep the problem of excessively sparse data in many strata (see my earlier post for details).

Rendon’s estimate breaks down into three main steps (a rough code sketch follows the list):

  1. Use multiple imputation to (randomly) assign perpetrators to deaths attributed to unknown perpetrators in the TRC data.
  2. Make direct capture-recapture estimates for all of the TRC’s geographical strata that admit direct estimation after step 1, i.e., all strata for the State, 11 strata for the SP (covering about 1/2 of documented deaths) and 9 strata on average for identified other groups (covering about 1/3 of documented deaths).
  3. Use kriging, an interpolation method that incorporates spatial correlation between sampled observations, to extend the direct estimates for killings attributed to non-State groups to cover the entire country.
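Here is the promised sketch of that three-step pipeline.  Every function body is a deliberately crude, hypothetical stand-in (this is not Rendon’s code): real multiple imputation, capture-recapture estimation and kriging are all far more involved than these placeholders.

```python
# Schematic sketch of the imputation -> direct estimation -> kriging pipeline.
# All numbers and function bodies are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(0)

def impute_perpetrators(perps, known_shares):
    """Step 1 (stand-in): randomly reassign 'unknown' perpetrator labels in
    proportion to the distribution of known perpetrators."""
    labels, probs = zip(*known_shares.items())
    return [p if p != "unknown" else rng.choice(labels, p=probs) for p in perps]

def direct_estimate(stratum_counts):
    """Step 2 (stand-in): one estimate per stratum from its documented counts;
    a real analysis would fit a capture-recapture model to the list-overlap table."""
    return 1.3 * sum(stratum_counts)  # placeholder inflation factor

def krige(direct_estimates, all_strata):
    """Step 3 (stand-in): fill in strata that lack a direct estimate; real kriging
    interpolates using spatial correlation, this placeholder just uses the mean."""
    fill = np.mean(list(direct_estimates.values()))
    return {s: direct_estimates.get(s, fill) for s in all_strata}

# toy usage, hypothetical numbers throughout
print(impute_perpetrators(["SP", "unknown", "State"], {"SP": 0.4, "State": 0.5, "Other": 0.1}))
estimates = krige({"A": direct_estimate([40, 60, 35]), "B": direct_estimate([150, 120, 90])}, ["A", "B", "C"])
print(estimates, "implied national total:", round(sum(estimates.values())))
```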

In addition, Rendon offers a separate estimate for which he divides the country into just 10 strata, in contrast to the TRC’s 58 strata. This coarser partition enables direct estimation for each perpetrator in each stratum.  Multiple imputation is again used to allocate to known perpetrators deaths attributed by the TRC to unknown perpetrators.  This estimate also comes out to about 46,000 with a standard error of around 3,500.

Rendon provides yet a third estimate, using multi-level modelling, that turns out to be similar to the first two.  This one is, however, rather complicated and I will not try to describe it in the present blog post.

In short, all of Rendon’s estimates point toward numbers that are substantially lower than what the TRC estimated but substantially higher than the number of documented deaths.

Finally, consider the estimates of Rendon and the TRC for killings attributed to the State.  Both are direct capture-recapture estimates but Rendon alone uses multiple imputation to account for unknown perpetrators, pushing his estimates nearly 40% above the TRC’s.  Rendon estimates roughly 28,000 State-caused deaths (standard error = 2,185) compared to roughly 20,500 (standard error = 1,718) for the TRC.  This difference suggests that the State may have evaded responsibility for quite a few deaths in the TRC’s accounting because of the way the TRC lumped deaths caused by unknown perpetrators in with deaths caused by “other” perpetrators.
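(Concretely: 28,000 / 20,500 ≈ 1.37, i.e., about 37% higher.)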

In my next post I will examine the response to the Rendon paper coming from two authors of the TRC estimates.