Periodic Reporting for period 2 - CLS INFRA (Computational Literary Studies Infrastructure)
Reporting period: 2022-09-01 to 2024-02-29
The project aims to consolidate, integrate and further develop institutional, national and regional efforts to build shared and sustainable access to high-quality data, tools and knowledge in literary studies in general and Computational Literary Studies (CLS) in particular. CLS INFRA uses a balanced set of Networking, Joint Research, and Transnational Access activities to transform the CLS community from one based on informal exchanges of knowledge and resources into one based on a shared infrastructure, while at the same time creating the conditions necessary for the wider adoption of digital technologies in traditional literary studies and related disciplines. To achieve its aims, CLS INFRA will pursue the following specific objectives:
Objective 1: Bridging knowledge-based resources for CLS community.
Objective 2: Mapping and matching specific requirements of CLS community.
Objective 3: Developing new tools and services for CLS users.
Objective 4: Mainstreaming of new tools/services.
Objective 5: Strengthening culture of cooperation.
Work Package 1 revolves around the project’s management and coordination. WP1 delivered two updated versions of the Data Management Plan (Deliverable 1.1).
Work Package 2 has been responsible for the project’s communication and dissemination. Apart from setting up the project's website and disseminating the project's activities on social media, WP2 has prepared a detailed Communication Plan (D2.1) that describes communication and dissemination strategies for the CLS INFRA project, and ensures that the results of CLS INFRA are exploited across all appropriate user communities.
The overarching objective of WP3 is to identify, document and showcase current shared practices in CLS research. To this end, WP3 published a detailed report, Baseline Methodological User Needs Analysis (D3.1), explored new means of dissemination by publishing an interactive Survey of Methods (D3.2), and completed a series of survey papers on methodological concerns (D3.3).
WP4 completed the "Skills support gap analysis" (D4.1). The main task of this deliverable was to explore current gaps in the teaching of research skills for computational literary studies, both to inform the CLS INFRA project’s own approach to training schools and to chart the territory and gain broader insight into current CLS teaching practices. WP4 has also organised two Training Schools and two workshops in CLS.
WP5 has focused its work on deliverable D5.1, "Review report documenting the state of literary data". The landscape review focuses on intellectual access, i.e. providing guidance for finding and sharing literary data, and consists of collecting and analysing literary corpora, available formats, tools, and metadata. Upcoming deliverables include "Case studies in data preparation and sharing" (D5.2) and "Toolkit report for data sharing" (D5.3).
To achieve one of the main objectives of WP6, the creation of a catalogue of existing literary corpora in Europe, special attention was paid to compiling and consolidating an extensive, manually curated collection of literary corpora and their metadata in close cooperation with WP5. Besides serving as the initial dataset for the catalogue, this inventory has informed the development of the data model underlying the catalogue. The deliverable "Extended transformation matrix including alternative formats" (D6.3) has also been completed and published.
The work in WP7 focuses on the conceptualisation and technical prototyping of a Programmable Corpus. First, in exchange with the partners in CLS INFRA, a domain-specific ontology for transnational drama corpora has been developed and is currently being tested under the name DraCorOn; work has also been done on various schemas (including an ODD file). Second, tools were developed to support users (with and without CLS expertise) in homogenising corpora in the sense of the Programmable Corpora approach, thus making them findable and usable in the supported infrastructure. Reports on programmable corpora (D7.1) and on versioning living corpora (D7.3), as well as a set of tools in R and Python to query DraCor (D7.2), have been published.
WP8 focuses its efforts on optimising the availability of fundamental NLP tools within a workflow for literary texts. A report on the tools (D8.1) and a report on annotation as enrichment (D8.2) have been published.
WP9 covers the management and oversight of the TNA selection process, as well as the recruitment and administration of the activities of the External Advisory Board. It has completed two calls for TNA research stays, with 31 successful applicants and 187 weeks awarded.
As far as WP4 is concerned, the potential impact of its activities lies primarily in the Training Schools. As outlined in the GA, they provide young researchers from different disciplinary backgrounds with a crash course in the essential skills needed for textual analysis. The Training Schools are designed to give participants competencies that should improve their position on the labour market and potentially propel forward research and knowledge dissemination in their respective home countries.
In WP5, the preliminary results of the study of the narratives of terrorism will be presented in November 2022 at the congress of the Association for Eurasian and Eastern European Studies in Chicago. Aiming to examine the tradition of Slovak novels, WP5 directed a cooperation proposal to a team based at the Slovak Academy of Sciences, with the goal of preparing a collection of novels that will comply with the standards set by the ELTeC collection.
WP8 is making progress in developing, maintaining and testing the methods, tools and workflows that will be provided in this project (i.e. D8.2-5). This includes work on TEI, NER and sentiment analysis in various languages. WP8 has ongoing discussions on how best to make the materials and documentation of these tools available, as well as which standards to strive for in providing them. The work of WP8 thus pushes the state of the art, as it seeks to develop multilingual toolchains for scholars working on historical literary materials. Such workflows have typically been tested only on contemporary sources, so every release or evaluation of a new tool is a step forward for this field. All these tools will be available for free and open use.