Periodic Reporting for period 4 - ChronHib (Chronologicon Hibernicum – A Probabilistic Chronological Framework for Dating Early Irish Language Developments and Literature)

Reporting period: 2020-03-01 to 2021-04-30

Early Medieval Irish literature (7th–10th centuries) is vast in extent and rich in genres, but owing to its mostly anonymous transmission, for most texts the precise time and circumstances of composition are unknown. Aside from a small number of texts that contain unambiguous historical references, the only clues for a rough chronological positioning of the texts are to be found in their linguistic features. Phonology, morphology, syntax and the lexicon of the Irish language changed considerably from Early Old Irish (7th c.) into Middle Irish (c. 10th–12th centuries). However, only the relative sequence of changes is well understood; for most sound changes very few absolute dates have been proposed so far.
Aims and goals: Chronologicon Hibernicum aimed to find a common solution for both problems, the undated texts and the undated linguistic changes. Through the linguistic profiling of externally dated texts (esp. annalistic writing and sources with a clear historical anchorage) and through serialising the emerging linguistic and chronological data, progress was to be made in assigning dates to the linguistic changes by using statistical methods and by estimating dates using Bayesian inference.
On this basis, a much tighter chronological framework for the developments of the Early Medieval Irish language can be created. More precise information about the dates of linguistic developments makes it possible to assign dates to hitherto undated texts. This will lead to a better chronological description of medieval Irish literature as a whole, which will have repercussions on the study of the history and the cultural and intellectual environment of medieval Ireland and on its connections with the wider world, leading to a better understanding of the historical development of Ireland.
The data studied and analysed in the project are collected in the database Corpus PalaeoHibernicum (CorPH). Major advancements of the project consist in the method of Variation Tagging, which adds metadata about linguistic variation and change to a corpus, and Bayesian Language Variation Analysis, which allows the modeling of language change over time for under-documented languages. These methodologies are transferable to other languages as well, thereby adding a new scientific method for shedding light on obscure periods of the past.
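The idea of modelling a linguistic change over time and then using it to date a text can be illustrated with a minimal sketch. It is not the project's Bayesian Language Variation Analysis, whose details are not given here. It assumes that the share of an innovative variant follows a logistic curve with a fixed steepness, that the prior is flat over a grid of years, and that the counts are invented toy data; all names in the code are hypothetical.

```python
import math

# Hypothetical toy data, invented for illustration only:
# (year of a securely dated text, innovative-variant count, total count)
DATED_TEXTS = [(700, 1, 20), (750, 4, 20), (800, 10, 20), (850, 16, 20), (900, 19, 20)]
GRID = list(range(650, 951, 10))  # candidate years, in steps of 10
RATE = 0.03  # assumed fixed steepness of the change (per year)


def innovative_share(year, midpoint, rate=RATE):
    """Logistic curve: expected share of the innovative variant in a given year."""
    return 1.0 / (1.0 + math.exp(-rate * (year - midpoint)))


def log_likelihood(k, n, p):
    """Binomial log-likelihood of k innovative forms out of n (constant term dropped)."""
    p = min(max(p, 1e-9), 1.0 - 1e-9)
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def normalise(log_weights):
    """Turn unnormalised log-posterior values into probabilities summing to 1."""
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def posterior_midpoint(texts, grid=GRID):
    """Posterior over the midpoint year of the change, under a flat prior on the grid."""
    return normalise([
        sum(log_likelihood(k, n, innovative_share(y, m)) for y, k, n in texts)
        for m in grid
    ])


def posterior_text_date(k, n, midpoint, grid=GRID):
    """Posterior over the date of an undated text with k innovative forms out of n."""
    return normalise([log_likelihood(k, n, innovative_share(y, midpoint)) for y in grid])


def most_probable(grid, posterior):
    """Grid value with the highest posterior probability."""
    return max(zip(posterior, grid))[1]


if __name__ == "__main__":
    change_post = posterior_midpoint(DATED_TEXTS)
    midpoint = most_probable(GRID, change_post)
    text_post = posterior_text_date(15, 20, midpoint)
    print("most probable midpoint of the change:", midpoint)
    print("most probable date of the undated text:", most_probable(GRID, text_post))
```

The sketch proceeds in two steps that mirror the reasoning above: dated texts constrain when the change took place, and the resulting curve is then used to place an undated text in time. A fuller treatment would combine many linguistic variables and propagate the uncertainty about the curve instead of fixing its midpoint.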
In the first 30 months of the project, the Chronologicon Hibernicum team worked on laying the linguistic-phonological groundwork on which the more advanced goals (e.g. statistical approaches) of the subsequent years could build. The team created and adapted lexicographic databases of key Old Irish texts, such as the Annals of Ulster, the so-called 'Minor Glosses' of Old Irish, and related texts. One of the central issues of this work was to agree on a standard for analysing and annotating Old Irish texts in a way that reflects the subtleties of synchronic variation and diachronic change in the language, and that makes the best use of the possibilities of current computational technology.
The second part of the project was dominated by the creation of the lexicographic database Corpus PalaeoHibernicum (CorPH). It includes 78 texts (of very diverse length) from the Old and Middle Irish periods, totalling approximately 135,000 tokens, of which c. 110,000 are Old Irish. Many of the included texts were checked against the manuscripts in the process of data entry, which made it possible to correct many textual errors that had been perpetuated in the past. All tokens (termed 'morphs' for Irish-language elements) in the database have been deeply annotated for information such as part of speech (POS) and morphological analysis, and also for variation. The development of a system of Variation Tagging is one of the methodological advances brought about by the project. The other major innovation was the development of a statistical method, based on Bayesian statistics, for modelling linguistic change over time (Bayesian Language Variation Analysis).
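To make the annotation layers described above more concrete, the following is a minimal sketch of how a token carrying part-of-speech, morphological and variation information might be represented and queried. It is an assumption for illustration only and does not reproduce the CorPH schema: the class `Morph`, its field names, the variable name `variable_x` and the placeholder forms are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Morph:
    """One annotated token; all field names are hypothetical, not the CorPH schema."""
    text_id: str
    form: str
    pos: str
    analysis: str
    # Variation tags: linguistic variable -> variant attested in this token
    variation: dict = field(default_factory=dict)


def variant_counts(morphs, variable):
    """Count, per text, how often each variant of one linguistic variable occurs."""
    counts = {}
    for morph in morphs:
        variant = morph.variation.get(variable)
        if variant is not None:  # skip tokens not tagged for this variable
            counts.setdefault(morph.text_id, Counter())[variant] += 1
    return counts


# Placeholder tokens, invented for illustration (not real Old Irish data)
SAMPLE = [
    Morph("text_A", "form_1", "verb", "3sg.pres", {"variable_x": "conservative"}),
    Morph("text_A", "form_2", "verb", "3sg.pres", {"variable_x": "innovative"}),
    Morph("text_A", "form_3", "noun", "nom.sg"),
    Morph("text_B", "form_4", "verb", "3sg.pres", {"variable_x": "innovative"}),
]

if __name__ == "__main__":
    for text_id, counter in variant_counts(SAMPLE, "variable_x").items():
        print(text_id, dict(counter))
```

Per-text counts of conservative and innovative variants of this kind are the sort of input that a statistical model of change over time would take.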
Four workshops and conferences were organised in Maynooth (2016–2019) on topics such as computational and corpus-based linguistics and variation and change in syntax and morphology. Key contributions to these workshops were published in the edited volume "Morphosyntactic Variation in Medieval Celtic Languages. Corpus-based Approaches". Eds. Elliott Lash, Fangzhe Qiu, David Stifter [= TiLSM 346], Berlin: de Gruyter 2020. doi: 10.1515/9783110680744.
As a by-product of the project work, many new insights into the diachronic and synchronic linguistic variation of Old Irish (phonology, morphology, syntax) have been gained, which will also affect our understanding of Old Irish grammar in general. Through the use of social media, the project members have not only disseminated the results of the project research, but have also raised awareness of Early Irish and its importance for the intellectual history of Ireland and of Europe as a whole. The project has maintained an exchange with other researchers and projects in Early Irish studies, and has fostered these collaborations in workshops.
The major progress that has been achieved in Chronologicon Hibernicum relates to the method of analysing Old Irish grammar and annotating it in texts, and to the depth of analysis achieved. This is a major stepping stone towards a comprehensive synchronic and diachronic description of the Old Irish language, a description that not only reflects the traditional grammatical categories, but is also capable of reflecting in a standardised way the synchronic variation and the diachronic change of the language. This is the first global goal of the project, and it will allow the development of computational tools and statistical methods for quantifying language variation and change in the next period of the project.
The major results have been the development of two historical-linguistic methods of describing and modelling language variation and change for under-documented languages: Variation Tagging and Bayesian Language Variation Analysis. The creation of the lexicographic database Corpus PalaeoHibernicum (CorPH) is a major advance for the way Old and Middle Irish will be studied in the future.
