CORDIS - EU research results

From Digital to Distant Diplomatics

Periodic Reporting for period 2 - DiDip (From Digital to Distant Diplomatics)

Reporting period: 2023-07-01 to 2024-12-31

In the web portal “Monasterium.net”, a keyword search or a few mouse clicks lead to stories about fugitive robber barons, heroic deeds, aiding and abetting flight, or families divided by religion. But the portal, with its more than 600,000 medieval and early modern documents, is also a testimony to the uniformity and diversity of legal culture in Europe. To properly classify these histories, one must know what people in the past wanted to record in documents, how they did it and what they used them for.

“Diplomatics” is the scientific discipline that addresses these questions – and it has been around for over 350 years. However, the established methods are not sufficient to deal with the large number of documents that have been created in Europe since the 13th century. The project “From Digital to Distant Diplomatics” (DiDip) will therefore bring diplomatics into the digital present. It aims to enable anyone interested in medieval and early modern documents to make use of the latest developments in data science and artificial intelligence when dealing with the documents.

For this, the computers need many examples to “learn” – and they need people to interpret the suggestions they make. That’s why we need an environment where humans and machines work together: Humans contribute their creativity and the ability to “understand” other people, as well as to draw meaningful insights from their experience with objects, to tell something about the past. The machine can process large amounts of data quickly, applying rules as well as learning new ones. The DiDip project will develop such a “Virtual Research Environment”.

The project will test the usefulness of the research environment by investigating European trends and regional differences in the production and use of 14th and 15th century charters. What influence do pan-European political institutions such as the Roman Church have on regional practices? How do local and regional practices react to the spread of Roman law among European legal thinkers? How do the two widespread authentication practices, by seal and by notarial signature, relate to each other? These questions will be answered by the project team using computer vision and machine language processing to identify trends, breaks, unifications and diversifications. The observations thus made on the digital representations of the documents will be related to European “major events” such as the Western Schism (1378-1417) or the Great Plague (1348/49) and the economic crisis that followed it.
The first three years of the project were dedicated to:

- an international conference "From Digital to Distant Diplomatics" in September 2022 (https://didip.hypotheses.org/conference-2022/) and the preparation of the publication in the following months;
- experiments in the three main research fields of the project (NLP, Computer Vision, Diplomatics): we found methods to identify major elements of the charters in the images, handle the linguistic diversity of the texts, and identify textual genres and text re-use, both among the charters themselves and in relation to external texts;
- dissemination activities: conference presentations with contributions of several members of the team (DHd2022 Trier, CVPR2022 New Orleans, Computational Humanities 2022 Amsterdam, AIUCD Siena 2023, Gallia Pontificia Workshop 2023 Paris, COLIBRI Workshop Text Mining & NLP 2023 Graz, ICMS Kalamazoo 2023, DH2023, conference on “The Potential of Prosopography for Historical and Art Historical Studies on the Charterhouses and the Carthusian Order” in Ljubljana 2023, Arbitration workshop Brno 2023, Formulaicity conference Amsterdam 2024, ICDAR 2023 San Francisco, Biblissima+ AI 2024 Paris, DH2024 Washington DC, DHd 2024 Passau, TPDL conference 2024 Ljubljana, AltRecSys conference 2024 Bari, ICARUS convention 2024 Naples) and several individual presentations of the project by the PI and the team;
- extracting, sanitizing, and converting existing data in Monasterium.net;
- evaluating solutions for the technical infrastructure; designing a new system and implementing a "file system database" as the core backend of the new infrastructure; prototyping an enhanced presentation and search interface, and HTR (handwritten text recognition). See https://github.com/Didip-eu for code resulting from the project;
- establishing contact with archives and acquiring new data from Germany, Italy, the Netherlands and France;
- training events: Winter School "Computer Vision for Humanists", Graz 2023 (https://didip.hypotheses.org/1574); Summer School "Computational Language Technologies for Medievalists", Graz 2024 (https://didip.hypotheses.org/nlp-summer-school-2024).
The project demonstrated in first experiments that basic visual features of medieval charters that are of interest to the diplomatist can be identified with machine learning methods. It established where methods for identifying "formulaicity" from computational linguistics diverge from the concept of "formulaicity" in diplomatics, and applied this insight to select a novel corpus of arbitration-related charters in Monasterium.net. It created language detection models suited to the descriptions and transcriptions of medieval charters. We plan to use these digital methods for text re-use detection, information extraction and palaeographical analysis, addressing research questions such as the relationship between charter texts and legal language or the European variants of notarial practices. Retrieval across the full Monasterium.net collection will be enhanced with machine-learning-based methods.