CORDIS - EU research results

User-Friendly Software For the Quantitative Analysis of Textual Data

Periodic Reporting for period 1 - QUANTEDA (User-Friendly Software For the Quantitative Analysis of Textual Data)

Reporting period: 2018-10-01 to 2020-03-31

The aim of QUANTEDA was to prove that it was possible to build a user-friendly version of text analysis software on top of the quanteda family of R packages, which I originally developed under ERC-2011-StG-283794-QUANTESS, Quantitative Analysis of Text for the Social Sciences. This "software-as-a-service" (SaaS) would provide a graphical user interface, require no programming experience, run entirely as a web application needing only a modern web browser, and offer an interface that is easily configurable for any language. Running in the cloud, the application would be available in free trial versions, with subscriptions unlocking more powerful features. Subscriptions would fund a company that could also support the continued development of the open-source quanteda library, which is also used directly in R by hundreds of thousands of users worldwide.

The technical challenge of the project was building a web application around software written in the R language, which is designed as a tool for single users rather than as the backend for scalable SaaS solutions. Working with a research officer and our technical subcontractor Appsilon Data Science (based in Warsaw), we devised a way to scale virtual "containers" running the software on a cloud platform, and to connect this to a secure user authentication system that could be linked to an e-commerce site from a web page. Through successive rounds of testing, we designed and built the application, overcoming several technical obstacles along the way. Working with groups of volunteer users, we cycled through several iterations of the user interface design to improve both the user experience and the functionality of the application.
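The idea of scaling containers behind an authentication layer can be illustrated with a small sketch. This is not the project's implementation: the class names, the token table, and the fixed per-container capacity are all assumptions made for illustration, and a real deployment would delegate these steps to a cloud platform and an external authentication service.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional


@dataclass
class Container:
    """Hypothetical stand-in for one container running the R backend."""
    container_id: int
    capacity: int
    sessions: List[str] = field(default_factory=list)

    def has_room(self) -> bool:
        return len(self.sessions) < self.capacity


class ContainerPool:
    """Assigns authenticated users to containers, adding containers on demand."""

    def __init__(self, capacity_per_container: int, valid_tokens: Dict[str, str]):
        self.capacity = capacity_per_container
        self.valid_tokens = valid_tokens  # assumed mapping: token -> user name
        self.containers: List[Container] = []
        self._ids = count()  # unique ids, even after containers are removed

    def authenticate(self, token: str) -> Optional[str]:
        # Placeholder for a real authentication service.
        return self.valid_tokens.get(token)

    def open_session(self, token: str) -> int:
        user = self.authenticate(token)
        if user is None:
            raise PermissionError("unknown or expired token")
        # Reuse the first container that still has room.
        for container in self.containers:
            if container.has_room():
                container.sessions.append(user)
                return container.container_id
        # Otherwise scale up by starting a new container.
        new = Container(next(self._ids), self.capacity, [user])
        self.containers.append(new)
        return new.container_id

    def close_session(self, user: str) -> None:
        for container in self.containers:
            if user in container.sessions:
                container.sessions.remove(user)
        # Scale down by dropping containers with no active sessions.
        self.containers = [c for c in self.containers if c.sessions]


pool = ContainerPool(2, {"t1": "ana", "t2": "ben", "t3": "cy"})
first = pool.open_session("t1")
second = pool.open_session("t2")
third = pool.open_session("t3")  # first container is full, so a new one starts
```

In this sketch every session occupies a slot in a container whether or not the user is actively computing, so the number of running containers grows with the number of open sessions.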

We proved the concept, and learned a great deal along the way about how to make the design more efficient. R and the tools we built around it do work as the basis for the web application, but they are not very cost-effective. In fact, it remains to be seen whether the application can be run at large scale at a cost below what we could feasibly charge users for access. As a result, we are currently working on a prototype with a different architecture that is much more efficient and scalable, but still built on the quanteda R analytic engine.

We think this model can be very successful as proof that commercial software as a service can be built on an open-source project, generating a profit that also supports open-source development that benefits, and is very popular with, the scientific text analysis community. The next phase will require refinement of the functionality and user interface, followed by a marketing campaign to sign up paying users. These revenues can then be used to continue improving the product.