Event Data in 30 Lines of Python

Much of my work involves improving large-scale systems that extract political events from text (see code from our NSF project on the subject here). These systems are designed for full production use across many hundreds of sources, both daily and retrospectively, in many dozens of event categories, including protests, armed conflict, statements, arrests, …


Managing Machine Learning Experiments

Reproducible methods like knitr and version control using git are on their way toward being standard for academic code, even in social science disciplines such as political science. knitr, R Markdown, and Jupyter notebooks make it easy to verify that your findings and figures come from the most recent version of your code and that it …

Three Tasks in Automated Text Geocoding

I've been working a lot on automated geocoding of text over the last six months, and I've found myself consistently describing the same set of tasks or ways to extract location information from text. Here are some quick thoughts on how to schematize these geolocation tasks, relate them to each other, and where I think …