From Translation to Creation: Changes in Ethiopic Style and Lexicon from Late Antiquity to the Middle Ages

Final Report Summary - TRACES (From Translation to Creation: Changes in Ethiopic Style and Lexicon from Late Antiquity to the Middle Ages)

The main aim of the TraCES project was a new approach to the study of a written heritage: a literary history considered in connection with language history and the history of material transmission. During its five-year run, the project set out to analyze in detail the lexical, morphological and stylistic features of Ethiopic texts in relation to their origins. To this end, traditional linguistic and philological methods were enriched with advances from the digital humanities.
The research was founded upon a newly created digital corpus of critically established texts. The team annotated important texts from different periods of Geez literature with the help of a complex multi-level tool for morphological annotation (GeTa), developed specifically for the project. For this purpose, the team defined an extensive project-specific morphological tag set. The tool synchronizes the Geez script with an automatically generated but manually correctable transcription and supports various tokenization, lemmatization and annotation options.
The annotated corpus can be exported to the ANNIS web-based visualization and search tool, which supports frequency and collocation analysis that may reveal significant changes in grammatical and lexical choices across the centuries. The resulting understanding of the history of the Geez language and of Ethiopian creativity and literary activity helps us establish features and criteria that may facilitate determining the origins of texts when the direct Vorlage is missing, as well as evaluate innovations when preparing critical text editions. The literary transmission and dissemination processes will be analyzed by contrasting and connecting the Ethiopian late antique and medieval heritage with its parallels and antecedents in the Near East and the Mediterranean, contributing to our understanding of the cultural networks of the Christian Orient.
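The kind of frequency and collocation comparison described above can be illustrated with a minimal sketch. It is independent of ANNIS and does not reproduce the project's own queries; the function names, the window size and the placeholder lemma labels (`lemma_a` and so on) are all assumptions made for illustration, and the two token lists are invented toy data, not project results.

```python
from collections import Counter


def relative_frequencies(tokens):
    """Return each lemma's share of all tokens in one sub-corpus."""
    counts = Counter(tokens)
    total = len(tokens)
    return {lemma: n / total for lemma, n in counts.items()}


def collocates(tokens, node, window=2):
    """Count lemmas occurring within `window` tokens of each `node` hit."""
    found = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                found[tokens[j]] += 1
    return found


# Invented toy data standing in for lemmatized texts from two periods.
early = ["lemma_a", "lemma_b", "lemma_a", "lemma_c"]
late = ["lemma_b", "lemma_c", "lemma_b", "lemma_a", "lemma_d"]

early_freq = relative_frequencies(early)
late_freq = relative_frequencies(late)

# A lemma whose share changes between periods is a candidate for closer study.
shift = late_freq.get("lemma_a", 0.0) - early_freq.get("lemma_a", 0.0)
neighbours = collocates(late, "lemma_b", window=1)
```

On a real corpus, raw shares of this kind would normally be complemented by a significance or association measure before any conclusion about diachronic change is drawn.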
At the same time, an innovative digital lexicon of the Geez language has been developed, the first such dictionary available online. The fundamental Lexicon Linguae Aethiopicae by A. Dillmann has been processed and subjected to data mining to create the basic database of roots and dependent lexemes. The data have been augmented by graphic variants generated automatically within the root tool on the basis of graphic interchangeability data, by the word list mined from the dictionary by W. Leslau, and by lexemes and named entities encountered in the annotated corpus. The dictionary, available online, offers a thesaurus function, showing actual examples of word usage from the project corpus.
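Automatic generation of graphic variants can be sketched as follows. This is not the project's root tool: the interchange groups below (the three h-letters, the two s-letters, the two laryngeals and the two ṣ-letters, which are commonly confused in Geez manuscripts) and the function name `graphic_variants` are assumptions for illustration. The sketch relies on the layout of the Unicode Ethiopic block, where each consonant series starts at a code point divisible by 8 and the seven basic vowel orders follow consecutively.

```python
from itertools import product

# Assumed interchange groups, given as the first-order code point of each series.
INTERCHANGE_GROUPS = [
    (0x1200, 0x1210, 0x1280),  # three h-series
    (0x1230, 0x1220),          # two s-series
    (0x12A0, 0x12D0),          # two laryngeal series
    (0x1338, 0x1340),          # two emphatic s-series
]

# Map every series base to the full group it belongs to.
GROUP_OF = {base: group for group in INTERCHANGE_GROUPS for base in group}


def alternatives(char):
    """Return all interchangeable spellings of one character (itself included)."""
    cp = ord(char)
    base, order = cp - cp % 8, cp % 8
    # Only the seven basic orders are swapped; the eighth slot is not uniform.
    if base in GROUP_OF and order < 7:
        return [chr(b + order) for b in GROUP_OF[base]]
    return [char]


def graphic_variants(word):
    """Return every spelling obtained by swapping interchangeable letters."""
    options = [alternatives(ch) for ch in word]
    return sorted("".join(combo) for combo in product(*options))


# A three-letter word whose first letter belongs to one of the s-series.
variants = graphic_variants("\u1230\u120b\u121d")
```

Because the number of variants grows multiplicatively with the number of interchangeable letters in a word, a production tool would likely filter the output against attested forms.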
The project also produced an innovative morphological parser for Geez, based on a set of rules and paradigms as well as on annotation data. With the help of the Alpheios plug-in, it can be applied to any Ethiopic text on the web, providing hints on possible translations and morphological analyses.
Metadata in TEI-XML format for the texts and named entities have been deposited on an external web resource; the annotated texts provide links to detailed descriptions of works, authors, translators, and personal and place names. The portal also hosts the philologically annotated digitized texts produced by the ERC project. The texts are searchable online and are harvested by the dictionary application.
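How linked named entities might be read out of such TEI-XML can be shown with a short sketch using the Python standard library. The sample fragment is invented and deliberately incomplete (it has no `teiHeader`), the `ref` values are placeholders rather than real identifiers, and the function name `named_entities` is hypothetical; only the TEI namespace and the `persName` and `placeName` elements are taken from the TEI standard itself. The project's actual schema may differ.

```python
import xml.etree.ElementTree as ET

# Standard TEI namespace; the prefix "tei" is chosen locally.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented fragment with placeholder identifiers, for illustration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p>A sentence naming <persName ref="PRS0000example">a person</persName>
       and <placeName ref="LOC0000example">a place</placeName>.</p>
  </body></text>
</TEI>"""


def named_entities(xml_text):
    """Return (element name, ref attribute, surface text) for each entity."""
    root = ET.fromstring(xml_text)
    entities = []
    for kind in ("persName", "placeName"):
        for el in root.iterfind(f".//tei:{kind}", TEI_NS):
            text = "".join(el.itertext()).strip()
            entities.append((kind, el.get("ref"), text))
    return entities


entities = named_entities(SAMPLE)
```

A harvesting application could follow each `ref` value to the corresponding authority record, which is the kind of linkage the paragraph above describes.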
As an additional research tool and a by-product, the project maintains a web portal collecting information on all Ethiopic texts available in digital format, whether in Hamburg or elsewhere in the world. In addition, a comprehensive bibliography on Ethiopic linguistics and related fields has been developed and made publicly available, with searchable tags, references to related publications and reviews, and links to online repositories where applicable.
With 13 conferences and workshops organized and over 70 papers and talks presented, the project has widely disseminated its research results. The metadata, software, and text corpus are all freely available and accessible online.