What can you do with the Peru Data?

Somebody asked a fair question in the comments surrounding the release of the Peru dataset: what can you do with it?

That is a very big question that I can’t fully address in a blog post.  Still, I’ll try to offer a few useful thoughts.  Perhaps some readers will jump in with better ideas.  Also, I’d be delighted to hear from anyone who downloads the data and does something interesting with it.

Here’s some background.

First of all, it is event data.  This means that each line in the spreadsheet is a discrete occurrence, such as a battle or a massacre.  Each event comes with several pieces of information, such as the date, location, number of people killed, violent actors involved and type of event.

The methodology documents posted on the conflict data page give a fair amount of detail on what is in the data and what the criteria are.  It could also be useful to read this data description for the Colombia conflict database (which is also posted on the conflict data page).  Of course, they are different conflicts and different databases, but the methodologies are very similar.

This paper by David Fielding and Anja Shortland used the Peru data to demonstrate escalation cycles (my phrase, not the authors’) in the conflict:

We show that an increase in civilian abuse by one side was strongly associated with subsequent increases in abuse by the other. In this type of war, foreign intervention could substantially reduce the impact on civilians of a sudden rise in conflict intensity, by moderating the resulting ‘cycle of violence’.

I’m afraid that the published version of their paper is behind a paywall but it should be possible to get hold of it if you really want to.

I believe that Fielding and Shortland didn’t use the event character of the data specifically, instead aggregating the events into monthly time series.  However, in this paper we worked entirely at the level of individual events, focusing on their sizes and timings:

Many collective human activities, including violence, have been shown to exhibit universal patterns [1-19]. The size distributions of casualties both in whole wars from 1816 to 1980 and terrorist attacks have separately been shown to follow approximate power-law distributions [6, 7, 9, 10]. However, the possibility of universal patterns ranging across wars in the size distribution or timing of within-conflict events has barely been explored. Here we show that the sizes and timing of violent events within different insurgent conflicts exhibit remarkable similarities. We propose a unified model of human insurgency that reproduces these commonalities, and explains conflict-specific variations quantitatively in terms of underlying rules of engagement. Our model treats each insurgent population as an ecology of dynamically evolving, self-organized groups following common decision-making processes. Our model is consistent with several recent hypotheses about modern insurgency [18-20], is robust to many generalizations [21], and establishes a quantitative connection between human insurgency, global terrorism [10] and ecology [13-17, 22, 23]. Its similarity to financial market models [24-26] provides a surprising link between violent and non-violent forms of human behaviour.

The Peru dataset was one of many we used in that article, which was about patterns in the size distributions and timings of events that appear in war after war, not just the war in Peru.
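For readers who want to look at size distributions themselves, here is a minimal Python sketch.  It is not the procedure from our article; it uses the standard continuous maximum-likelihood formula for a power-law exponent, which is only an approximation for integer casualty counts.  The function names, the choice of xmin and the list of sizes are all made up for illustration.

```python
import math

def ccdf(sizes):
    """Return (x, fraction of events with size >= x) for each distinct size x."""
    n = len(sizes)
    return [(x, sum(1 for s in sizes if s >= x) / n) for x in sorted(set(sizes))]

def powerlaw_alpha(sizes, xmin=1):
    """Continuous maximum-likelihood estimate of the power-law exponent.

    Uses alpha = 1 + n / sum(ln(x_i / xmin)) over events with size >= xmin.
    This is a textbook approximation, not the fitting method of any paper.
    """
    tail = [s for s in sizes if s >= xmin]
    log_sum = sum(math.log(s / xmin) for s in tail)
    if log_sum == 0:
        raise ValueError("need at least one event larger than xmin")
    return 1 + len(tail) / log_sum

# Invented event sizes (people killed per event), for illustration only.
example_sizes = [1, 1, 1, 2, 2, 4]

for x, frac in ccdf(example_sizes):
    print(x, round(frac, 3))
print("estimated alpha:", round(powerlaw_alpha(example_sizes), 3))
```

On a log-log plot, an approximate power law shows up as a roughly straight line in the output of ccdf.  With real data the choice of xmin matters a lot, so it is worth trying several values.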

The reader’s comment also asked about possible projects for undergraduates.  I’m not sure how to answer this question without knowing more about what kinds of undergraduates we’re talking about and what kinds of skills they have.  But students could certainly do various data manipulation exercises such as breaking down the data by region, perpetrator or type of event.
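As a rough idea of what such exercises might look like, here is a small Python sketch using only the standard library.  The field names ("date", "region", "perpetrator", "type", "killed") are my assumptions and will not match the actual column headers in the spreadsheet, and the three records are invented placeholders, not real events.  The last function builds the kind of monthly time series mentioned above.

```python
from collections import Counter, defaultdict
from datetime import date

# Invented placeholder records; field names are assumptions, not the
# real column headers of the Peru spreadsheet.
events = [
    {"date": date(1985, 3, 4), "region": "Region A", "perpetrator": "Actor X",
     "type": "battle", "killed": 5},
    {"date": date(1985, 3, 20), "region": "Region A", "perpetrator": "Actor Y",
     "type": "massacre", "killed": 12},
    {"date": date(1985, 7, 1), "region": "Region B", "perpetrator": "Actor X",
     "type": "battle", "killed": 0},
]

def count_by(events, field):
    """Number of events for each value of the given field."""
    return Counter(e[field] for e in events)

def killed_by(events, field):
    """Total number of people killed for each value of the given field."""
    totals = defaultdict(int)
    for e in events:
        totals[e[field]] += e["killed"]
    return dict(totals)

def monthly_series(events):
    """Map (year, month) to (number of events, total killed), in date order."""
    series = defaultdict(lambda: [0, 0])
    for e in events:
        key = (e["date"].year, e["date"].month)
        series[key][0] += 1
        series[key][1] += e["killed"]
    return {key: tuple(value) for key, value in sorted(series.items())}

print(count_by(events, "region"))
print(killed_by(events, "perpetrator"))
print(monthly_series(events))
```

To work with the real data, one would replace the hand-written list with rows read from the downloaded file, for example with csv.DictReader after exporting the spreadsheet to CSV, and convert the date and casualty fields to the right types.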

I hope that this post was useful.  I would be happy to respond to further questions.

 

 


Predicting Armed Conflict Events

Zee Media is launching a news channel and website and, somehow or other, I wound up featured in their website launch.

Here is the link.

Although I find it a bit harrowing to watch myself so much on camera, I would say that Zee came up with an interesting and original way to present the material.

The basic idea for the piece came from this blog post from a few months back.  This research programme has been more about armed conflict than about terrorism whereas the Zee piece has pretty much the opposite priority.  Still, I think their angle works pretty well.

Like many media outlets, Zee was very interested in the possibility of prediction.  Hopefully, viewers will come away from the piece with realistic expectations about the potential for prediction.  I doubt we will ever be in a position to mine past patterns and then make a useful prediction saying that there will be an attack at a particular time and place.  But I do think that it is possible to make useful predictions about broad patterns in violent events, such as the relative numbers of attacks of different sizes.
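To make the idea of predicting relative numbers concrete, here is a small Python sketch.  It assumes that event sizes follow a continuous power law with exponent alpha, so that the share of events of size at least x falls off as x to the power -(alpha - 1).  The function name is hypothetical and the value alpha = 2.5 in the example is purely illustrative, not an estimate for any particular conflict.

```python
def expected_tail_ratio(alpha, x1, x2):
    """Expected ratio of (events with size >= x2) to (events with size >= x1).

    Assumes a continuous power law with exponent alpha > 1, for which
    P(size >= x) is proportional to x ** -(alpha - 1).
    """
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    if x1 <= 0 or x2 <= 0:
        raise ValueError("sizes must be positive")
    return (x2 / x1) ** -(alpha - 1)

# Illustrative only: with alpha = 2.5, how common are events with at least
# 10 deaths relative to events with at least 1 death?
print(expected_tail_ratio(2.5, 1, 10))
```

A prediction of this kind can be checked against new data simply by counting events above each threshold, which is exactly the sort of test that tells us whether a theory is holding up.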

I would add that we should put a lot of effort into making predictions because this is the best way for us to learn when our theories are working and when they need to be modified.  It is very easy to cling endlessly to faulty theories when you never test them with predictions.

PS – There were a couple of minor errors in the piece that I’m trying to get corrected, mainly identifying me as a mathematician rather than as an economist.

PPS – All the work that I describe in the Zee piece is joint with Neil Johnson and Stijn Van Weezel.  I mentioned this on camera but this information didn’t make it into the final version.