Quantitative Analysis of Textual Data for Social Sciences

Project Information

QUANTESS

Grant agreement ID: 283794

Project closed

Start date 1 November 2011

End date 30 April 2017

Funded under

Specific programme: "Ideas" implementing the Seventh Framework Programme of the European Community for research, technological development and demonstration activities (2007 to 2013)

Total cost

€ 1 357 919,99

EU contribution

€ 1 357 919,99

Coordinated by

LONDON SCHOOL OF ECONOMICS AND POLITICAL SCIENCE
United Kingdom

Final Report Summary - QUANTESS (Quantitative Analysis of Textual Data for Social Sciences)

QUANTESS was designed to advance methods and tools for the quantitative analysis of social science text. It has achieved this through applications of “text as data” in political and other social sciences, through numerous published articles, and through three major software packages for the R language. Research outputs include new methodologies for scaling latent quantities from text, drawing on bag-of-words methods and the automated application of dictionaries; methods for automatically coding texts by combining crowd-sourced sentence annotation with statistical scaling; and unsupervised methods for uncovering latent quantities from text, using either word counts or human-annotated codes. All of these developments have been accompanied by a major software library for the R language, quanteda, which is currently downloaded by nearly 5,000 users per month. This tool enables powerful, flexible, and fast natural language processing and quantitative analysis of text, using fully open-source, documented, and tested methods. Accompanying this package are spacyr (for tagging parts of speech, extracting entities, and parsing dependencies) and readtext (for reading texts into R, including converting them from a variety of formats). Finally, the project has enabled several community-building initiatives: the founding of a Text as Data Society (which has held five annual conferences, including one hosted using project funds), a Text Analysis Developers’ Workshop, and educational dissemination activities to train students in the methods and tools developed by the project.
