Periodic Reporting for period 1 - VERITRACE (Traces de la Verité: The reappropriation of ancient wisdom in early modern natural philosophy)
Reporting period: 2023-04-01 to 2025-09-30
The project is designed around three main research objectives:
1 To provide a comprehensive roadmap of the incorporation of ancient wisdom writings into early modern natural philosophy
2 To map out the ancient wisdom discourse as it spread from Renaissance Italy to early modern Europe, with particular emphasis on its inclusion in natural philosophical debates
3 To trace the influence of perceived watershed moments in the reception and perception of these ancient wisdom writings
The impacts of VERITRACE extend well beyond its current scope. It is intended to set a new methodological standard for historical research by relying not on predefined hypotheses, but on research questions to which the answer is simply not known. It addresses these by exploring hundreds of thousands of sources to provide detailed and statistically significant answers, and to explore how these answers migrate across temporal, linguistic, geographical, and cultural domains. Its deliverables include a rich and detailed dataset covering printed volumes in all major European languages from 1540 to 1728, in which each object carries a unique identifier compatible with the Universal Short Title Catalogue and other meta-standards. They also include a fully developed web-based user interface for research and a set of compatible open-access scripts that future projects can reuse.
Our development of a comprehensive web application represents a significant technological achievement, creating a functional digital workbench that enables sophisticated searches across approximately 430,000 texts from our three primary data sources. This technological accomplishment realises the digital infrastructure envisioned in WP1 and provides the foundation for the LSA and Sentiment Analysis techniques central to our methodology.
The successful acquisition and standardisation of the complete Distant Reading Corpus constitutes another major achievement. We have systematically collected and integrated texts and metadata from EEBO, Gallica, and the Bavarian State Library, creating the largest corpus ever assembled for studying ancient wisdom discourse in early modern Europe. This achievement required substantial negotiation, particularly with the Bavarian State Library, and extensive technical work, establishing the empirical foundation for addressing all three research objectives.
Our peer-reviewed publication "The Challenges of Multilingualism in the Search for Ancient Wisdom" (late 2024) in the CHAI Workshop proceedings represents the first peer-reviewed contribution emerging from the project, demonstrating the methodological innovations we are developing for cross-linguistic analysis of historical texts.
Finally, our proof of concept for Large Language Model integration, a successful demonstration of LLM-assisted metadata enrichment and cleaning, represents an unexpected methodological breakthrough that advances the state of the art in digital humanities while addressing practical challenges in managing large historical datasets.
The development of sophisticated text matching capabilities across multiple early modern languages (Latin, French, German, Dutch, English, Italian) represents an advance in cross-linguistic historical text analysis. Our peer-reviewed conference proceedings paper addresses technical problems that have limited previous scholarship's ability to trace textual transmission across linguistic boundaries, and the solutions it demonstrates advance multilingual matching of ancient wisdom texts beyond existing capabilities.
Our assembly of approximately 430,000 texts specifically focused on tracing ancient wisdom discourse represents the largest corpus ever created for this research domain, enabling questions that could not previously be addressed due to scale limitations. A corpus of this size makes statistical analysis of cultural transmission patterns possible at a previously unattainable scale.
The systematic approach we have developed for combining and standardising metadata from three major institutional sources, enhanced by comparison with the Universal Short Title Catalogue and semi-automated LLM processing, advances the state of the art in bibliographic data integration for historical research. This integrated metadata enrichment pipeline provides a model for future large-scale digital humanities projects requiring cross-institutional data harmonisation.
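To illustrate what cross-institutional harmonisation can involve in practice, the following is a minimal sketch in Python, not the project's actual pipeline. It maps records from the three sources onto one shared schema. All field names (`tcp_id`, `ark`, `dc_title`, `titel`, and so on), the `Record` class, and the `harmonise` function are hypothetical and chosen only for illustration; the comparison with the Universal Short Title Catalogue and the LLM-assisted cleaning steps are not shown.

```python
import re
import unicodedata
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """Shared target schema (hypothetical, for illustration only)."""
    source: str
    source_id: str
    title: str
    year: Optional[int]
    language: Optional[str]


# Hypothetical per-source field names; the real exports may differ.
FIELD_MAPS = {
    "EEBO": {"id": "tcp_id", "title": "title", "year": "date", "language": "lang"},
    "Gallica": {"id": "ark", "title": "dc_title", "year": "dc_date", "language": "dc_language"},
    "BSB": {"id": "bsb_id", "title": "titel", "year": "jahr", "language": "sprache"},
}


def normalise_title(raw: str) -> str:
    """Apply Unicode NFC normalisation and collapse runs of whitespace."""
    text = unicodedata.normalize("NFC", raw)
    return re.sub(r"\s+", " ", text).strip()


def parse_year(raw: object) -> Optional[int]:
    """Extract the first four-digit year between 1400 and 1899, if any."""
    match = re.search(r"\b(1[4-8]\d{2})\b", str(raw or ""))
    return int(match.group(1)) if match else None


def harmonise(source: str, raw: dict) -> Record:
    """Map one raw source record onto the shared schema."""
    fields = FIELD_MAPS[source]
    language = raw.get(fields["language"])
    return Record(
        source=source,
        source_id=str(raw[fields["id"]]),
        title=normalise_title(raw.get(fields["title"], "")),
        year=parse_year(raw.get(fields["year"])),
        language=language.strip().lower() if language else None,
    )
```

Under these assumptions, a call such as `harmonise("Gallica", {"ark": "x1", "dc_title": "  De  la   verité ", "dc_date": "[1641?]", "dc_language": " FRE "})` yields a record with a cleaned title, the year 1641, and the language code `fre`. Keeping the per-source mappings in a single table means that adding a further institution would only require one new entry.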