Tutorial: Information Extraction for Social Science Research | Andy Halterman

This workshop provides an interactive introduction to information extraction for social science: techniques for identifying specific words, phrases, or pieces of information contained within documents. It focuses on two common techniques, named entity recognition and dependency parsing, both implemented with the spaCy library, and shows how they can provide useful descriptive data about the civil war in Syria. It concludes with a brief application of question-answering models to social science information extraction.
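As a rough illustration of the two spaCy techniques named above, here is a minimal sketch. It is not taken from the tutorial notebook: the helper names (`named_entities`, `subject_verb_object`, `run_demo`) and the example sentence are hypothetical, and the demo assumes the `en_core_web_sm` model has been installed separately.

```python
import spacy


def named_entities(doc):
    """Return (text, label) pairs for every entity span on a spaCy Doc."""
    return [(ent.text, ent.label_) for ent in doc.ents]


def subject_verb_object(doc):
    """Return (subject, verb, object) triples read off the dependency parse.

    Only the simplest pattern is handled: a nominal subject (nsubj) whose
    head also has a direct object (dobj) child. Passives, conjunctions,
    and prepositional objects are ignored in this sketch.
    """
    triples = []
    for token in doc:
        if token.dep_ == "nsubj":
            verb = token.head
            for child in verb.children:
                if child.dep_ == "dobj":
                    triples.append((token.text, verb.text, child.text))
    return triples


def run_demo(model_name="en_core_web_sm"):
    """Parse one made-up sentence; requires the named model to be installed."""
    nlp = spacy.load(model_name)
    # Hypothetical example sentence, not drawn from the tutorial's data.
    doc = nlp("Government forces shelled a neighborhood in Aleppo on Tuesday.")
    print(named_entities(doc))
    print(subject_verb_object(doc))
```

Calling `run_demo()` prints the entities and subject-verb-object triples the model finds. The `nsubj` and `dobj` labels follow the scheme used by spaCy's English pipelines; models for other languages may use different labels.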

A Colab notebook is available here.

You can also watch a YouTube presentation of the tutorial.

I’ve given this tutorial as part of the NLP+CSS 201 tutorial series and at ICWSM 2022.

Andy Halterman
Assistant Professor, MSU Political Science

My research interests include natural language processing, text as data, and subnational armed conflict.