This is the latest installment in a series that considers the statistical report done for the Peruvian Truth and Reconciliation Commission (TRC), Silvio Rendon’s critique of this statistical report, and a reply to Rendon from Daniel Manrique-Vallier and Patrick Ball (MVB), who worked on the TRC statistical report. The present post continues to discuss the MVB reply.

(Note that I may not resume this series until Silvio Rendon’s rejoinder is published. Meanwhile, I’m also working with Josh Dougherty of Iraq Body Count on an offshoot post that will cover the practice and pitfalls of matching deaths across multiple lists.)

Today I’ll comment on nine figures from the MVB reply: figure 1 in the main body of the paper and figures 2-9 in the appendix.

I won’t reproduce any of the figures here because they are misleading and a picture is worth a thousand words. The main features I object to are that the figures substitute lower (preliminary) stratum-level estimates for Rendon’s main estimates and suppress the uncertainty surrounding these estimates. Moreover, MVB portray some of these lowered point estimates as falling within an “impossibility region,” a characterization which further assumes that MVB’s matching of deaths across sources was perfectly executed on fully accurate data.

Nevertheless, the figures do convey some interesting simulation-based information that addresses the question of when a direct estimation approach outperforms MVB’s indirect one and vice versa. Each of the nine figures uses data from a stratum for which one can directly estimate Shining Path (SP) deaths. (There are nine such strata before multiple imputation and two more, not covered by the figures, after multiple imputation.)

The X axis in each picture represents all the possible true values for the number of SP-caused deaths (with the true values indexed by N). MVB perform simulations that estimate the number of SP-caused deaths many times for each stratum and for each N using both direct capture-recapture and MVB’s indirect capture-recapture methodology. MVB then calculate the deviation of each estimate from the underlying true value, square these deviations (so that negative deviations do not cancel out positive ones) and take the mean of these squared deviations across all simulation runs for each value of N. Finally, they graph these “mean-squared errors” for each method and each N in all nine strata.
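To make the mean-squared error calculation concrete, here is a minimal sketch of the general recipe. It is not MVB's code. As a stand-in for a direct estimator it assumes a simple two-list capture-recapture setup with Chapman's estimator, and the capture probabilities `p1` and `p2`, the values of N and the number of runs are all hypothetical choices made only for illustration.

```python
import random


def chapman_estimate(n1, n2, m):
    """Chapman's two-list capture-recapture estimate of population size.

    n1, n2: number of victims on each list; m: number on both lists.
    """
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def simulate_mse(true_n, p1, p2, runs=2000, seed=0):
    """Mean-squared error of the estimator when the true count is true_n.

    p1 and p2 are assumed (hypothetical) probabilities that a victim
    appears on list 1 and on list 2, independently of each other.
    """
    rng = random.Random(seed)
    total_sq_dev = 0.0
    for _ in range(runs):
        n1 = n2 = m = 0
        for _ in range(true_n):
            on_list1 = rng.random() < p1
            on_list2 = rng.random() < p2
            n1 += on_list1
            n2 += on_list2
            m += on_list1 and on_list2
        estimate = chapman_estimate(n1, n2, m)
        # Squaring keeps negative deviations from cancelling positive ones.
        total_sq_dev += (estimate - true_n) ** 2
    return total_sq_dev / runs


# One mean-squared error per assumed true value N, as on the X axis.
mse_by_n = {n: simulate_mse(n, p1=0.3, p2=0.4, runs=500) for n in (100, 200, 400)}
print(mse_by_n)
```

Repeating this for a second estimator and plotting both curves against N would give a picture of the same general type as the nine figures, although the actual estimators and data in the MVB simulations are of course different.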

For eight out of the nine strata the direct method outperforms the indirect method (i.e., has lower mean-squared errors) for values of N below some critical value, and the indirect method outperforms the direct one above this same critical value. (For one stratum the reverse is true, but there is never a big difference between the two methods in this stratum so this doesn’t seem to matter much.) For three strata the critical value at which the best-performing method switches from direct to indirect is inside of MVB’s “impossibility region”.

In eight out of the nine strata the indirect method outperforms the direct method when the true number of people killed by the SP is set equal to the estimate that the TRC actually made for that stratum (using the indirect method). Essentially, this rather unsurprising result says that the indirect method performs well in simulations of cases for which the TRC’s indirect estimate delivered a correct result. And the indirect method also performs well when the TRC’s estimate is not spot on but still reasonably close to being correct.

The direct method tends to outperform the indirect one in simulations that start from the assumption that the direct estimate is correct. Nevertheless, in three out of the nine strata the indirect method actually wins this contest.

Overall, these simulation results tend to favor the indirect method over the direct one, especially when the true numbers are assumed to be rather high.

That said, the direct method in the simulations does not match Rendon’s main method because, again, MVB omit the multiple imputation step of Rendon’s procedures. Incorporating multiple imputation should shift the balance back towards Rendon. And, again, I would like to see a similar exercise performed on Rendon’s alternative approach that covers the whole country with ten strata.

Here’s one last point before I sign off. As of now, the MVB reply is still just a working paper, not yet published in *Research and Politics*. The main advantage of posting a working paper before publication is that you can respond to feedback. Thus, it would be great and appropriate for MVB to take advantage of the remaining time window by purging the misleading material about impossible point estimates without uncertainty intervals from the published version of their paper. (See post 4 and post 5 of this series in addition to the present one for further details.) This move would help lead us toward more fruitful future discussions.

We explained in previous replies why we compared unimputed counts against unimputed estimates: because the comparison shows the problems of the approach itself. Everything else follows from there.

In this post, you argue that we should compare the imputed estimates. Here they are: the count of the SLU killings including the records with imputed perpetrator is 19,636. This is greater than Rendon’s global SLU estimate of 18,341. Here’s a table of the imputed data relative to the 2003 counts:

| in_2003 | in_MIMDES | nSLU_imputed |
|---|---|---|
| Yes | Yes | 4340 |
| Yes | No | 6294 |
| No | Yes | 9002 |
| No | No | ?? |
| TOTAL_SLU | | 19636 |
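As a quick check on the arithmetic, the three observed cells of the table can be added up as follows. This is only a sketch; the variable names are illustrative, and the unobserved “No/No” cell is left out because it is unknown.

```python
# Observed cells of the table, keyed by (in_2003, in_MIMDES).
observed_cells = {
    ("Yes", "Yes"): 4340,
    ("Yes", "No"): 6294,
    ("No", "Yes"): 9002,
}

# Total observed SLU count, excluding the unknown "No/No" cell.
total_observed = sum(observed_cells.values())

# Marginal totals for each source.
in_2003_total = sum(v for (a, _), v in observed_cells.items() if a == "Yes")
in_mimdes_total = sum(v for (_, b), v in observed_cells.items() if b == "Yes")

print(total_observed, in_2003_total, in_mimdes_total)
```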

As we have mentioned in previous posts, when we compare Rendon’s imputed estimates against imputed counts including MIMDES, Rendon’s estimates are still too low. They are still below the *clearly incomplete* total observed count. Keep in mind that these counts include the pre-TRC sources (which didn’t cover SLU much if at all, which is why we had to do the indirect estimation), plus the TRC data, plus the MIMDES data.

As should be evident, these counts are still incomplete. After the CVR finished in 2003, MIMDES went to the field through 2006 and found over 9000 SLU victims that the CVR had not identified. Not only that: MIMDES missed 6290 victims that the CVR had previously found.

Given this, are you willing to argue that MIMDES and CVR essentially covered the whole universe of victims of SLU? Do you believe that a hypothetical new project would not find substantial numbers of additional victims? Boiling this down, do you really think that the entry ‘??’ in the table above should be 0?

We don’t. Thus the confidence intervals on Rendon’s estimates here are irrelevant. The counts are not the true value, they’re the absolute minimum. And Rendon’s work fails against that check.

This is not a new argument. It is a consequence of what we present in our paper, and is what we have repeated numerous times here. The underestimate is a direct consequence of starting from wrong assumptions.

Lastly, this involves matching individual victims, not events. The victims nearly all had two given names, two family names, years of birth and death, and locations of death. The matching is pretty easy.


Thank you for this. Here are some immediate reactions.

At this point in the discussion I would agree that the true number of deaths caused by the Shining Path (SP) is probably higher than Rendon’s point estimates.

I continue to insist that it’s bad statistical practice to look only at point estimates. I’m disappointed that the two of you went ahead and published your paper even after this flaw was pointed out.

As of now I don’t see strong reasons to expect that credible documentation of SP-caused deaths would rise to a level that would make Rendon’s SP estimates look dubious, let alone rise to the level that would support the TRC’s SP estimate.

I could, potentially, change this view but I would need to see a lot of new evidence. A first step in trying to make such a case would be engagement with the data collection methodologies of the various efforts.

The TRC’s claim that the SP killed more people than the State did seems rather far fetched at this point. Possibly the numbers killed by the two groups are roughly similar but it is quite a stretch to place the SP well beyond the State as the TRC did. Related to this comparison is Rendon’s point, which doesn’t seem to be in dispute, that the TRC substantially underestimated deaths caused by the State.

I remain concerned about the opacity of your work. Perhaps it’s reasonable to treat 19,500 as a rock-bottom documented number for SP deaths. But I would require more information to get me there.


Reply to Manrique-Vallier, point by point.

1. Hypothesis testing with point estimates

“We explained in previous replies why we compared unimputed counts against unimputed estimates: because the comparison shows the problems of the approach itself. Everything else follows from there.”

This is an incorrect mechanical inference, which is refuted by evidence. The multiple imputation estimates have to be assessed as confidence intervals as well, not as point estimates. And the multiply imputed total observations are INSIDE the confidence interval that I have estimated:

The observation by MVB based on MIMDES-TRC data is 19636. My MI estimate is 18341 with a confidence interval of [15637, 21045]. Nobody can reject this estimate based on commonly accepted statistical theory.
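The comparison can be spelled out with the numbers quoted above. The sketch below only restates that arithmetic; it is not a formal statistical test, and the variable names are illustrative.

```python
observed_count = 19636              # MVB's imputed observed count
mi_estimate = 18341                 # MI point estimate
ci_lower, ci_upper = 15637, 21045   # quoted confidence interval

# Does the observed count fall inside the quoted interval?
inside_interval = ci_lower <= observed_count <= ci_upper

# Half-width of the interval and distance of the count from the point estimate.
half_width = (ci_upper - ci_lower) / 2
gap = observed_count - mi_estimate

print(inside_interval, half_width, gap)
```

The observed count sits 1295 above the point estimate, which is less than half of the interval's half-width of 2704.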

Very clearly, MV is still doing simple arbitrary comparisons of assumed deterministic numbers: “Rendon’s estimates are still too low. They are still below the *clearly incomplete* total observed count.”

2. Beliefs about unobserved counts

“As it should be evident, these counts are still incomplete. (…) Do you believe that a hypothetical new project would not find substantial numbers of additional victims? Boiling this down, do you really think that the entry ‘??’ in the table above should be 0? We don’t.”

MV wonders whether the “??”, that is, the number of unobserved victims, is 0 (zero). He does not believe it is.

OK, and he may be right. It may not be zero. But that does not imply that the unknown number has to be 11695, which is the difference between the BASM estimate for the TRC (31331) and their observation of 19636. Such a big number is extremely unlikely. We always have to remember that the center of the analysis should be the TRC numbers by Manrique-Vallier, Ball and coauthors, which were done for a truth commission and as such have the aura of being ‘official’ numbers.

MV says that the TRC found around 9000 more Shining Path victims than previous sources and that MIMDES found 6290 more victims than the TRC and previous sources.

Is it believable that a new data collection project will find 11695 more victims?

I do not think so. Moreover, I do not see Manrique-Vallier, Ball or anyone answering “yes” to this question.

The whole point here is that MV and Ball are accountable for the TRC work of 2003 and they have no way to justify such a large and inaccurate prediction. It is THEIR methods that are flawed. So they attempt to divert attention away from their work in a one-sided discussion.

Suppose that a new data collection project finds 3000 more victims of the Shining Path. The new total of about 22600 victims found for the SP would still be very far from the 31000 BASM estimate, and also still closer to my MI upper limit than to the TRC lower limit.
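The figures in this point can be laid out explicitly. The sketch below uses only numbers quoted in this thread; the TRC lower limit is not given here, so it is left out, and the variable names are illustrative.

```python
basm_estimate = 31331            # BASM estimate for the TRC
observed_count = 19636           # MVB's imputed observed count
mi_upper_limit = 21045           # upper limit of the MI confidence interval
hypothetical_new_victims = 3000  # assumed yield of a new project

# Unobserved victims implied by the BASM estimate.
implied_unobserved = basm_estimate - observed_count

# Total after the hypothetical new project (the rounded 22600 above).
hypothetical_total = observed_count + hypothetical_new_victims

# Distances from the hypothetical total to each benchmark.
gap_to_basm = basm_estimate - hypothetical_total
gap_to_mi_upper = hypothetical_total - mi_upper_limit

print(implied_unobserved, hypothetical_total, gap_to_basm, gap_to_mi_upper)
```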

3. Are confidence intervals irrelevant?

“the confidence intervals on Rendon’s estimates here are irrelevant. The counts are not the true value, they’re the absolute minimum. And Rendon’s work fails against that check.”

What fails here is the logic that underlies this reasoning. This statement is all wrong.

That an observed value is claimed to be an absolute minimum does not imply that an estimate can be taken as just a point estimate. It is still a random number with a corresponding confidence interval. Order statistics can be minima and they are still random numbers. “Absolute” and “minimum” are not synonyms of “deterministic”. We have to compare random numbers with each other, so confidence intervals are by no means irrelevant. In his own understanding of statistics, MV insists on comparing just point estimates.
