As political scientists, we are often interested in using text to understand the actions of political actors. Thankfully, have a growing set of tools for identifying political actors in text, including named entity recognition and dependency parses, custom event models, or hand labeling events text.
POLECAT is a new global event dataset for social science research, coded in the PLOVER event ontology.
Much of my work involves improving large-scale systems to extract political events from text (see code from our NSF project on the subject here). These systems are designed for full production use over many hundreds of sources both daily and for the past in many dozens of event categories, including protests, armed conflict, statements, arrests, and humanitarian aid.
This tutorial covers how to create event data from a new set of text using existing Open Event Data Alliance tools. After going through it, you should be able to use the OEDA event data pipeline for your own projects with your own text.
I was going through the Petrarch2 dictionary code, working on live updating of dictionaries for the human coding interface, and ended up taking a look at the dictionaries’ coverage of different CAMEO event codes.
I’ve been working a lot of automated geocoding of text over the last 6 months, and I’ve found myself consistently describing the same set of tasks or ways to extract location information from text.