Three Tasks in Automated Text Geocoding

I’ve been working a lot on automated geocoding of text over the last 6 months, and I’ve found myself consistently describing the same set of tasks, or ways to extract location information from text. Here are some quick thoughts on how to schematize these geolocation tasks, how they relate to each other, and where I think the future of the research is.

CLIFF-up: Easy, Automated Geocoding of Text Documents

One of the most important trends in political science is the growth in subnational, geographically disaggregated quantitative research. The prerequisite for this research, of course, is having plentiful and high-quality georeferenced data. The software for generating georeferenced data is often scarce, difficult to build, or hard to use. As part of my work with the Open Event Data Alliance to generate high-quality, freely available political event data, I’ve taken what I think is perhaps the best open source news text geocoding system, MIT’s CLIFF, and packaged it into a virtual machine in the hope that anyone can set it up for their own use in a matter of minutes.

ENCoRe Conference Paper: A New, Near-Real-Time Event Dataset and the Role of Versioning

Two weeks ago, I went to the fall conference for the European Conflict Research Network (ENCoRe) in Uppsala, Sweden. The research projects that the (exceptionally welcoming) presenters detailed are really interesting and almost universally involve the production of new datasets. I presented a paper I wrote with John Beieler on some of the work we’re doing with PETRARCH, the Open Event Data Alliance (OEDA), and the production of new event data.

Developments in Event Data

By John Beieler and Andy Halterman

The two of us, along with Phil Schrodt, Patrick Brandt, Erin Simpson, and Muhammed Idris, have been working on several interrelated projects that we believe will improve the availability and quality of event data. We’ve discussed these projects formally at ISA and informally at MPSA and elsewhere, but we think these issues are important enough that they bear repeating here. These four projects are PETRARCH, Phoenix, EL:DIABLO, and the Open Event Data Alliance (OEDA). (Fun game: which of these are acronyms, backronyms, and regular words?)

Data Churn and Data Versioning

Jay Ulfelder has a very nice post about the problems that applied researchers face when working with data that changes rapidly in its availability and production. I agree with the suggestions he proposes (which were, very roughly, 1. modularity in applied uses, 2. transparency in generating data, and 3. awareness of the larger data ecosystem), and wanted to add my own. This is a lightly edited version of a comment that I left on his post.

The Good Judgement Project and Bayes’ Calculator

It’s been a very forecast-y week. Between (finally) reading Phil Tetlock’s excellent Expert Political Judgment, going to a half dozen panels at ISA on forecasting and event data, and today’s NPR story about the Good Judgment Project, I’ve been thinking a lot about how to make political forecasts and how we know when they’re good. I wanted to share one of the tools I’ve been using for forecasting and calculating subjective probabilities.

ISA 2014 Paper

The International Studies Association had its annual conference last week in Toronto. I met many people I knew only online or from the reference sections of papers, and had a great time overall.

Here are the paper and slides that I wrote with my co-author, Jill Irvine, on first steps toward measuring political mobilization using event data, specifically GDELT.

Because of the ongoing legal controversy around GDELT, we have no immediate plans to submit the paper for publication, but we welcome any feedback on the methodology and plan to use it in the future.
