CORDIS - EU research results

Morphologically Linked Old Irish Resource

Periodic Reporting for period 1 - MOLOR (Morphologically Linked Old Irish Resource)

Reporting period: 2023-10-01 to 2025-09-30

MOLOR (Morphologically Linked Old Irish Resource) is a project that aims to create an online knowledge base, that is, a graph encoding entities and their relationships, for Old Irish, a language spoken between 600 and 900 CE. Many digital resources for historical (and generally under-resourced) languages, including Old Irish, are ‘data silos’: while they may be comprehensive and of a high standard, the data (and hence the knowledge) remain scattered and cannot be queried and investigated effectively and efficiently. This inhibits progress in linguistic and philological study in an increasingly technologically mediated world. MOLOR contributes to creating lexicographical standards and interoperability between digital and computational resources for Old Irish, helping scholars to better understand this challenging medieval language. Old Irish presents particularly complex issues given its elaborate inflectional system, its manuscript transmission history, and the editorial decisions involved in producing normalised texts. The project goes beyond traditional printed dictionaries and grammars and uses state-of-the-art web technologies. More theoretically, the project is interested in word-based grammatical variation in Old Irish, which is studied by the linguistic subdiscipline of morphology, and investigates how best to represent this variation using a set of canonical forms or lemmas (similar to dictionary headwords in more traditional terms). The main objective of the project is to build such a collection of lemmas, which constitutes the skeleton of an interlinked digital ecosystem of language resources and the linguistic data that they contain and describe. The impact of the project can be defined along four axes:
- Scientific—application of and contribution to standards for historical language lexicography
- Educational—better tools for understanding and teaching Old Irish
- Technological—usage of Linguistic Linked Data models and standards
- Cultural—preservation and accessibility of European linguistic heritage

Project website: molor.eu.
The main activity in the project was building a comprehensive digital Lemma Bank for Old Irish using RDF technology, which allows different graph databases to talk to each other effectively. This Lemma Bank currently consists of approximately 6000 entries, mostly verbs and nouns. The project has demonstrated how digital tools can be adapted to work with historical languages that have limited resources and documentation. For modelling purposes, the project used the LiLa ontology, a standardised framework originally designed for Latin (lila-erc.eu), and adapted it to work with Old Irish linguistic data. One of the key decisions was how to handle the messy reality of historical language data, in particular the variation between dictionaries in how they lemmatise the same word. Old Irish texts show considerable variation in spelling, word forms, and grammatical analysis depending on the manuscript source and scholarly tradition. Rather than forcing artificial consistency through a single harmonised annotation system, the project opted to preserve the distributed nature of the data, using dedicated properties to link morphological alternatives and spelling variants, inspired by the way LiLa has modelled lemmas. This maintains the integrity of different scholarly approaches while enabling digital processing and interlinking. While automatic extraction of lemmas from dictionaries and other lexical resources was possible to some extent, building the Lemma Bank entailed careful alignment of entries across these resources and a significant amount of manual work and validation, with the aim of attaining a high standard in this important novel resource. For the verb lemmas, the PF was assisted by an intern, whom he guided throughout the process of collecting and aligning.
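The modelling choice described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual data or pipeline: the base URI `http://example.org/lemma/` is a placeholder, the property names `writtenRep` (OntoLex) and `lemmaVariant` (LiLa) reflect our reading of how LiLa links spelling variants and morphological alternatives and should be checked against the published ontology, and the Old Irish forms are illustrative examples rather than entries taken from the MOLOR Lemma Bank.

```python
from collections import namedtuple

# Namespaces: LILA and ONTOLEX are the public namespaces as we understand them;
# BASE is a hypothetical placeholder, not the MOLOR namespace.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
LILA = "http://lila-erc.eu/ontologies/lila/"
ONTOLEX = "http://www.w3.org/ns/lemon/ontolex#"
BASE = "http://example.org/lemma/"

# A literal with a language tag; "sga" is the ISO 639 code for Old Irish.
Literal = namedtuple("Literal", ["text", "lang"])


def lemma_triples(lemma_id, spellings, variant_ids=()):
    """Build triples for one lemma.

    Spelling variants become several written representations of the same
    lemma; morphological alternatives are separate lemmas linked by a
    variant property (assumed here to be lila:lemmaVariant).
    """
    subject = BASE + lemma_id
    triples = [(subject, RDF_TYPE, LILA + "Lemma")]
    for spelling in spellings:
        triples.append((subject, ONTOLEX + "writtenRep", Literal(spelling, "sga")))
    for variant_id in variant_ids:
        triples.append((subject, LILA + "lemmaVariant", BASE + variant_id))
    return triples


def to_ntriples(triples):
    """Serialise triples as N-Triples lines (no escaping; fine for this sketch)."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            obj = '"' + o.text + '"@' + o.lang
        else:
            obj = "<" + o + ">"
        lines.append("<" + s + "> <" + p + "> " + obj + " .")
    return "\n".join(lines)


# Illustrative data: one verb with three spellings of the same lemma, and one
# verb cited under two morphologically different lemma forms.
graph = []
graph += lemma_triples("do-beir", ["do-beir", "do·beir", "dobeir"])
graph += lemma_triples("labraithir", ["labraithir"], variant_ids=["labraid"])
graph += lemma_triples("labraid", ["labraid"], variant_ids=["labraithir"])

print(to_ntriples(graph))
```

Keeping spelling variants on a single lemma while linking morphological alternatives as separate lemmas means no dictionary's citation form has to be discarded, which is the point of avoiding a single harmonised annotation system.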

The main achievements can be summarised as follows:
- The MOLOR project has applied the Linked Data paradigm to Old Irish, using semantic web standards and controlled vocabularies; this is a first for the language
- For the purpose of collecting lemmas, the project has investigated and critically reflected on the lexicographical landscape for Old Irish, the weaknesses and strengths of the available resources, and lemmatisation choices
- The project has produced the MOLOR Old Irish Lemma Bank, a crucial piece of technology for future interlinking of linguistic resources

The following are important outcomes:
- The MOLOR Old Irish Lemma Bank and the process of its creation may act as an inspiration for scholarly communities working on other historical and ancient languages
- In the spirit of Linked Data, everyone can link to this new resource in a completely open and free manner
- The project has contributed to uncovering European heritage, including linguistic heritage and multilingualism, which are important pillars in Horizon Europe
The project successfully created a flexible digital framework that can accommodate the complex nature of Old Irish grammar and spelling variation. The Lemma Bank can grow to include more words and grammatical categories over time, and it allows different linguistic resources to be connected and searched together. This means that researchers will in time be able to ask complex questions about grammar and vocabulary across multiple resources and texts. This approach could serve as a model for digitising other under-resourced historical languages, showing how modern semantic web technologies can be successfully applied to ancient linguistic materials. As mentioned above, further uptake is safeguarded by the fact that everyone can link to this new resource in a completely open and free manner. The key remaining need is to enlarge the Lemma Bank with more entries and word categories so that it exhaustively covers the lexicon of Old Irish.
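How a shared lemma lets separate resources be searched together can be illustrated with a small Python sketch. Everything named here is hypothetical: the lemma URI, the resource identifiers (`dictionaryA`, `corpusB`, `grammarC`), and the links between them are invented for illustration and do not describe real MOLOR data or any existing interface.

```python
from collections import defaultdict

# Hypothetical lemma identifiers (placeholder namespace).
DO_BEIR = "http://example.org/lemma/do-beir"
LABRAID = "http://example.org/lemma/labraid"

# Each resource links its own items (entries, tokens, paradigms) to a lemma.
# All identifiers below are invented for this sketch.
links = [
    ("dictionaryA:entry/do-beir", DO_BEIR),
    ("corpusB:token/0045", DO_BEIR),
    ("corpusB:token/0312", DO_BEIR),
    ("grammarC:paradigm/do-beir", DO_BEIR),
    ("dictionaryA:entry/labraid", LABRAID),
]


def index_by_lemma(resource_links):
    """Group resource items by the lemma they point to."""
    index = defaultdict(set)
    for item, lemma in resource_links:
        index[lemma].add(item)
    return index


def items_for(index, lemma, resource_prefix=None):
    """Return the sorted items linked to a lemma, optionally from one resource."""
    items = index.get(lemma, set())
    if resource_prefix is not None:
        items = {i for i in items if i.startswith(resource_prefix + ":")}
    return sorted(items)


index = index_by_lemma(links)
all_items = items_for(index, DO_BEIR)
corpus_items = items_for(index, DO_BEIR, resource_prefix="corpusB")
print(all_items)
print(corpus_items)
```

Because every resource points at the same lemma identifier, a question such as "where does this verb occur in the corpus, and which dictionary entry describes it?" becomes a single lookup instead of a separate search in each silo.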