Periodic Reporting for period 1 - QUANTEDA (User-Friendly Software For the Quantitative Analysis of Textual Data)
Reporting period: 2018-10-01 to 2020-03-31
The technical challenge of the project was to build a web application around software written in the R language, which is designed as a tool for single users rather than as the backend for scalable SaaS solutions. Working with a research officer and our technical subcontractor Appsilon Data Science (based in Warsaw), we devised a way to scale virtual "containers" running the software on a cloud platform, and to connect these to a secure user authentication system that could be linked to an e-commerce site from a web page. Through successive rounds of testing, we designed and built the application, overcoming several technical obstacles in the process. Working with groups of volunteer users, we cycled through several iterations of the user interface design to improve both the user experience and the functionality of the application.
We proved the concept, and along the way learned a great deal about how to make the design more efficient. R and the tools we built around it do deliver a working web application, but they are not very cost-effective. In fact, it remains to be seen whether the application can be run at large scale at a cost below what we could feasibly charge users for access. As a result, we are currently working on a prototype based on a different architecture that is much more efficient and scalable, but that still uses the quanteda R analytic engine.
The project set out to prove that commercial software as a service can be built on an open-source project, generating a profit that in turn supports open-source development that benefits, and is very popular with, the scientific text analysis community. We think that this model can be very successful. The next phase will require refining the functionality and user interface, followed by a marketing campaign to sign up paying users. The resulting revenues can then be used to continue improving the product.