Tutorial: Information Extraction for Social Science Research
This workshop provides an interactive introduction to information extraction for social science: techniques for identifying specific words, phrases, or pieces of information contained within documents. It focuses on two common techniques, named entity recognition and dependency parsing with the spaCy library, and shows how they can provide useful descriptive data about the civil war in Syria. It concludes with a brief application of question-answering models to social science information extraction.
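To give a flavor of the two techniques, here is a minimal sketch of reading entities and a subject-verb-object triple off a spaCy `Doc`. It is not taken from the tutorial notebook: the helper names `extract_entities` and `extract_svo` are hypothetical, the example sentence is made up, and the annotations are written by hand so the snippet runs without downloading a trained model. In practice you would get the `Doc` from a trained pipeline (for example `spacy.load("en_core_web_sm")`), and the dependency label names (`nsubj`, `dobj`) depend on which model you use.

```python
import spacy
from spacy.tokens import Doc, Span


def extract_entities(doc):
    """Return (text, label) pairs for every named entity in the Doc."""
    return [(ent.text, ent.label_) for ent in doc.ents]


def extract_svo(doc):
    """Return (subject, verb, object) triples found via the dependency parse.

    Assumes English-style labels: 'nsubj' for subjects, 'dobj' for objects.
    """
    triples = []
    for token in doc:
        if token.dep_ == "nsubj":
            verb = token.head  # the subject's head is the governing verb
            for child in verb.children:
                if child.dep_ == "dobj":
                    triples.append((token.text, verb.text, child.text))
    return triples


# Hand-annotated example (an assumption for illustration, not model output).
vocab = spacy.blank("en").vocab
words = ["The", "UN", "delivered", "aid", "to", "Aleppo"]
heads = [1, 2, 2, 2, 2, 4]  # absolute index of each token's head
deps = ["det", "nsubj", "ROOT", "dobj", "prep", "pobj"]
doc = Doc(vocab, words=words, heads=heads, deps=deps)
doc.ents = [Span(doc, 1, 2, label="ORG"), Span(doc, 5, 6, label="GPE")]

print(extract_entities(doc))
print(extract_svo(doc))
```

With a trained pipeline, the same two helpers can be applied to `nlp(text)` for each document in a corpus, which is roughly how entity counts and who-did-what-to-whom descriptions can be turned into descriptive data.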
A Colab notebook is available here.
You can also watch a YouTube presentation of the tutorial.
I’ve given this tutorial as part of the NLP+CSS 201 tutorial series and at ICWSM 2022.