Periodic Reporting for period 3 - KITAB (Exploring Cultural Memory in the Pre-Modern Islamic World (700–1500): Knowledge, Information Technology, and the Arabic Book)
Reporting period: 2021-05-01 to 2022-10-31
Through our work, we are advocating for a new way of working on the history and historiography of the Middle East and, by analogy, of any large historical textual tradition. This approach involves computational analysis of corpora, or collections of texts, that are vetted and prepared by scholars and readable by a computer. Specifically, it involves using algorithms to detect passages shared between pairs of such texts and then finding patterns among them. If this work is done forensically and critically, academics can aspire to create a map that represents the intertextuality of an entire surviving written tradition.
The creation of the corpus represents an important step for the field. A team member is focused on mapping the corpus to printed editions and manuscripts in catalogues today. This mapping is important for considerations of cultural memory, for understanding what is in our corpus and what is not, and indeed for understanding any historical text, whether for digitally or non-digitally informed research. That is because it puts into sharp focus the partiality of our access to the past and helps to show how later periods profoundly mediated our access to earlier ones.
In terms of technical method and platform, we have adapted for Arabic a text reuse algorithm called passim, which finds and aligns common parts of texts from across the corpus. We have also developed machine learning methods to automatically detect transmissive chains (isnads), a characteristic feature of Arabic texts. We are working to produce a platform that will feature applications allowing researchers to access and interact with our data. Currently, text readers for Arabic do not offer the same functionality as those for English and other European languages. Our work on text reuse, search, and named entity recognition is important for our research goals (understanding how texts were copied and recopied across time) but is also relevant for reading interfaces. Through complementary funding, we have created a beta-version text reader in Arabic that uses our data.
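To give a sense of what detecting common parts of texts can mean in practice, the following is a minimal sketch in Python. It is not passim's implementation and does not reproduce the project's pipeline: it simply marks stretches of one text whose word n-grams also occur in another text. The function names (`tokenize`, `shingles`, `shared_passages`), the n-gram length, and the normalisation step (removing Arabic short-vowel marks and tatweel) are all assumptions made for illustration.

```python
import re
from typing import List, Set, Tuple

# Hypothetical normalisation: strip Arabic diacritics (U+064B..U+0652)
# and tatweel (U+0640) so that vocalised and unvocalised spellings match.
_MARKS = re.compile(r"[\u064B-\u0652\u0640]")


def tokenize(text: str) -> List[str]:
    """Remove diacritics and tatweel, then split on whitespace."""
    return _MARKS.sub("", text).split()


def shingles(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of all word n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_passages(a: str, b: str, n: int = 3) -> List[str]:
    """Return runs of words in `a` whose every n-gram also occurs in `b`.

    This is a crude approximation: it checks only that each n-gram occurs
    somewhere in `b`, not that the whole run is contiguous in `b`, and it
    tolerates no spelling variation or word insertion. A real text reuse
    tool would follow this kind of candidate search with an alignment step.
    """
    tokens_a = tokenize(a)
    grams_b = shingles(tokenize(b), n)
    passages: List[str] = []
    start = None  # index in tokens_a where the current run began
    for i in range(len(tokens_a) - n + 1):
        hit = tuple(tokens_a[i:i + n]) in grams_b
        if hit and start is None:
            start = i
        elif not hit and start is not None:
            # The last matching n-gram began at i - 1 and covers n words.
            passages.append(" ".join(tokens_a[start:i - 1 + n]))
            start = None
    if start is not None:
        # The run extends to the end of the text.
        passages.append(" ".join(tokens_a[start:]))
    return passages


if __name__ == "__main__":
    later = "he said that the river flows north toward the sea every spring"
    earlier = "we read that the river flows north toward the hills"
    print(shared_passages(later, earlier))
```

Exact n-gram matching of this kind is brittle for historical texts, where copyists abbreviate, paraphrase, and vary spellings; that is one reason a dedicated alignment tool is used rather than a simple overlap test.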
In terms of book history, our ongoing case studies focus on authorial practices, changing forms of the book, and narrative adaptations. Team members have given 43 research presentations, published 22 blog posts, and submitted 5 papers for publication that relate to book history.
A second book, co-written by the project's research team (including the PI), explores the workings of cultural memory. We work both with the dataset as a whole (including analysis of the corpus itself) and with case studies that link books and authorial practices to specific memory communities.
Additional books, book chapters, and articles are based on the research team's own case studies. These rely on our corpus, data, and methods, and go into greater detail on particular case studies.
A key outcome of the project will be not just our scientific findings but also a community of scholars working in Arabic who want to contribute to, and use, our corpus, data, and methods. The past five years have witnessed major advances in machine learning. Because of our ERC funding, the KITAB project has been able to partner with computer scientists to advance machine learning for historical Arabic, build the corpus, and initiate networks that can develop further after the project ends. We have begun working closely with a user group of external, early career scholars whom we are training to access and contribute to our work. Both the humanities scholars and the computer scientists profit by working together, and a key achievement will be working out how to create such successful partnerships. We are now also advising and working closely with scholars of Persian who wish to undertake work similar to ours.