Silvio Rendon’s powerful rejoinder on the statistical work of Peru’s Truth and Reconciliation Commission (TRC) is finally out.
This is a big development so I’m starting a brand new series to cover it. Of course, this new series is related to the earlier one. But I’ll strive to make the present series self-contained so you can start here if you want to.
The sequence of events leading up to the present moment goes as follows.
I. The TRC of Peru publishes a statistical report that makes two surprising claims.
- Nearly 3 times as many Peruvians were killed in the war (1980–2000) as the combined efforts of human rights NGOs, Peru’s Ombudsman office and the TRC were able to document on a case-by-case basis.
- The Shining Path (SP) guerrillas killed more people than the State did – reversing the pattern of the combined list of documented deaths, which formed the basis for the TRC’s statistical work.
II. Silvio Rendon publishes a critique of the TRC’s statistical work. He also proposes new estimates which, compared to the TRC’s, increase the number of deaths attributed to the State, decrease the number attributed to the SP and to “Other” groups, and decrease the total number of deaths for the war as a whole. His estimates are inconsistent with the TRC’s surprising conclusions.
III. Daniel Manrique-Vallier and Patrick Ball (MVB), two authors of the TRC’s original statistical report, reply with a critique of Rendon’s estimates, indirectly defending their own SP estimates.
My earlier series covers the above three developments.
IV. Now we have Rendon’s rejoinder which mostly attacks the original work of the TRC but also defends his own estimates from MVB’s critique.
One caveat before proceeding: Rendon’s work is replicable but I have not tried to replicate it. I’ll just assume here that Rendon’s claims are correct. This is reasonable since no one has discovered a substantive error in Rendon’s work in the debate so far.
Rendon’s rejoinder finds grave and terminal deficiencies in the statistical report of Peru’s TRC. There are several issues in play but today I’ll cover just the issue of overfitting, which turns out to be quite a big problem.
A footnote in the TRC statistical report itself provides a good explanation of the dangers of overfitting. The quoted text below can supplement or replace the overfitting link above or, if you prefer, you can try this very short alternative explanation:
Overfitting occurs when the model fits the data too closely and therefore models only these data and none other. As the number of parameters used to fit the model approaches the number of cells in the table, all of the available information has been used for the model fitting, and none remains for the estimation. The goal is to find a model that fits reasonably well, but not so well that the same model would fail to fit different data describing the same phenomenon.
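The footnote’s point can be shown with a toy example outside the TRC setting (a generic sketch, not the TRC’s log-linear models): a model with as many parameters as data points reproduces the sample exactly, spending all of the available information on the fit and leaving none for anything else.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

# Ten observations from a simple linear process plus noise.
x = np.linspace(0.0, 1.0, 10)
y = 2.0 * x + rng.normal(0.0, 0.3, size=10)

# Saturated fit: 10 coefficients for 10 points interpolates the sample exactly.
saturated = Polynomial.fit(x, y, deg=9)
in_sample_error = float(np.max(np.abs(saturated(x) - y)))

# Fresh draws from the same process: the "perfect" model no longer fits.
y_new = 2.0 * x + rng.normal(0.0, 0.3, size=10)
out_of_sample_error = float(np.max(np.abs(saturated(x) - y_new)))

print(in_sample_error)       # essentially zero: all information spent on fitting
print(out_of_sample_error)   # far larger: nothing left for new data
```

The saturated model fits these data and none other, which is exactly the failure mode the footnote describes.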
The TRC statistical report also proposes a policy to avoid overfitting, although they ignore it for their SP estimates: reject models with goodness-of-fit p values that exceed 0.5. (Possible p values range from 0, i.e., no fit whatsoever, up to 1, i.e., a perfect fit.)
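For concreteness, a goodness-of-fit p value for a log-linear model is conventionally computed by comparing the model’s deviance to a chi-square distribution on the model’s residual degrees of freedom. Here is a sketch of how the stated rejection rule would operate; the deviance and degrees-of-freedom numbers are made-up illustrations, not TRC figures:

```python
from scipy.stats import chi2

def gof_p_value(deviance: float, df: int) -> float:
    """Upper-tail chi-square p value: near 1 signals a suspiciously perfect fit."""
    return float(chi2.sf(deviance, df))

def reject_as_overfit(p: float, threshold: float = 0.5) -> bool:
    """The report's stated policy: reject models whose p value exceeds 0.5."""
    return p > threshold

# Hypothetical strata models: (deviance, residual degrees of freedom).
models = {"reasonable fit": (4.2, 3), "near-perfect fit": (0.01, 3)}
for name, (dev, df) in models.items():
    p = gof_p_value(dev, df)
    verdict = "reject (overfit)" if reject_as_overfit(p) else "keep"
    print(f"{name}: p = {p:.3f} -> {verdict}")
```

Note that a deviance of zero gives p = 1, the “perfect fit” end of the scale described above.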
Given this policy it’s shocking to discover that the TRC bases its SP estimates on perfectly fitting models in 14 out of its 58 geographical strata.
In fact, I learned through correspondence with Silvio Rendon that these 14 cases of egregious overfitting are just the tip of an overfitting iceberg. In a further 12 strata the TRC models are so close to perfect fits that their goodness-of-fit p values round up to 1. Moreover, p values fall between 0.7 and 1 for an additional 13 strata and between 0.5 and 0.7 for a further 8 strata. In other words, by their own stated standard the TRC estimated SP-caused deaths from overfit models in 47 of their 58 strata. Most of these models are badly overfit.
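Adding up the bands just listed against the report’s own 0.5 cutoff (counts taken from the correspondence described above):

```python
# Strata counts by goodness-of-fit p value band, as reported above.
bands = {
    "perfect fit (p = 1)": 14,
    "p rounds up to 1": 12,
    "p between 0.7 and 1": 13,
    "p between 0.5 and 0.7": 8,
}
overfit_strata = sum(bands.values())
total_strata = 58
print(f"{overfit_strata} of {total_strata} strata exceed the 0.5 cutoff")
```

That leaves only 11 of the 58 strata whose SP models clear the report’s own policy.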
To summarize, based on overfitting issues alone we should bin most of the TRC’s SP estimates.
And overfitting is just one of the problems that plagues these SP estimates. Stay tuned for more in part 2 of this series.
PS – You may want to check out a parallel series on the blog about the perils of matching deaths and events across lists. It focuses on Iraq data but co-author Josh Dougherty and I discuss connections that are relevant for the Peru discussion.