Periodic Reporting for period 1 - IMEP DCCP (Index of Middle English Prose: Digital Cotton Catalogue Project)
Reporting period: 2021-09-01 to 2023-08-31
The project had three main objectives:
I Preparing a printed IMEP volume describing all Middle English prose from the Cotton collection housed in the British Library. Assembled by Sir Robert Bruce Cotton (1571-1631), this collection is one of the most famous manuscript collections in Britain.
II Developing a search tool capable of handling linguistic variation in Middle English. This tool was co-developed with the Text Laboratory of the University of Oslo for integration into the IMEP website.
III Drafting guidelines for the future development and digitisation work related to the online version of IMEP.
Development of the search tool took place in Oslo in collaboration with the project supervisor, Jacob Thaisen, who was responsible for language models, and with programming support from the Text Laboratory. The tool was scheduled to be tested during a secondment period at Cambridge Digital Humanities (CDH). Guidelines for further digitisation of the resource were submitted to the IMEP board in August 2023.
Results of the project were disseminated to an academic audience through five conference papers or posters, two peer-reviewed articles, and a workshop in Cambridge.
The project reached a popular audience through blog posts, a popular article on Sciencenorway, and a YouTube video published on the channel of the Faculty of Humanities at the University of Oslo.
While the search tool is awaiting further testing, its core functionality is in place and can be adapted to other projects dealing with premodern spelling variation. The Python-based Fuzzy Search Script (IMEP-FSS) is now available as Open Access. Details about it have been accepted for publication in a forthcoming article on Open Research Europe (ORE). The project is therefore expected to have an impact on how texts characterised by major linguistic variation are searched.
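To illustrate the kind of problem a fuzzy search over premodern spelling has to solve, the following is a minimal sketch in Python. It is not the IMEP-FSS code and does not reproduce its method or its language models: the normalisation table, the function names (`normalise`, `similarity`, `fuzzy_search`) and the threshold are all assumptions made for this example, and the similarity measure is simply the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher

# Hypothetical normalisation table (an assumption for this sketch, not taken
# from IMEP-FSS): collapses a few common Middle English spelling alternations.
NORMALISE = str.maketrans({"þ": "th", "ð": "th", "v": "u", "y": "i", "j": "i"})


def normalise(word):
    """Lower-case a word and collapse the alternations listed above."""
    return word.lower().translate(NORMALISE)


def similarity(a, b):
    """Return a 0..1 similarity ratio between two normalised spellings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def fuzzy_search(query, words, threshold=0.8):
    """Return (word, score) pairs scoring at least `threshold`, best first."""
    scored = [(w, similarity(query, w)) for w in words]
    hits = [(w, s) for w, s in scored if s >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Illustrative word list, not drawn from the Cotton manuscripts.
    for word, score in fuzzy_search("thing", ["þing", "thyng", "kyng"]):
        print(word, round(score, 2))
```

In this sketch, normalisation handles predictable letter alternations (such as thorn for "th" or "u" for "v"), while the similarity threshold catches remaining differences. A production tool would likely need a much richer model of variation than a fixed character table.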