Project description
Digital analysis of Arabic written tradition
The premodern Arabic textual tradition (700-1800) is one of the most prolific in history. Advanced digital technology now allows for the analysis of these texts, identifying similarities beyond simple word matches, including paraphrasing and translation. The ERC-funded KITAB-Transform project aims to reveal hidden connections between texts and shed light on the attitudes and practices that have shaped the Arabic written tradition, influencing Islamicate societies’ cultural memory. It will identify and document transformational reuse across thousands of Arabic books using a growing corpus of digitised texts supported by an OCR pipeline. The project will produce interdisciplinary publications and develop a web application for scholars to easily access the data. This initiative will significantly contribute to the history of Arabic books.
Objective
The premodern Arabic textual tradition between 700 and 1800 is one of the most prolific in human history, with the number of surviving works surpassing the extant production of any other pre-print culture except perhaps imperial China or Sanskritic India. These texts can now be studied in completely new ways using pioneering digital technology that can identify textual units displaying what one scholar has termed ‘extended textual similarity’. This means not only word-for-word matching, which can be detected by existing techniques and has been computationally studied, but also textual overlap obscured by transformations such as paraphrase and translation. By identifying, documenting, and interpreting this kind of transformational reuse across thousands of Arabic books, KITAB-Transform’s team of eight historians, linguists and computer scientists will uncover hitherto obscured connections between texts and shed light on the writerly attitudes and practices that created the Arabic written tradition and continue to shape Islamicate societies’ reservoirs of memory today. The project relies on a vast corpus of digitised texts that continues to grow thanks to an OCR pipeline developed by team members. KITAB-Transform will produce interdisciplinary monographs and edited volumes, journal articles, blog posts, language models, and data sets, as well as a web application through which less technically inclined scholars can access and work with our data, supported by a robust outreach programme. Our work will make an important contribution to Arabic book history and help redress a significant imbalance in the development of large language models, which favour English and modern social media but are comparatively impoverished for historical languages.
Fields of science (EuroSciVoc)
CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.
This project has not yet been classified with EuroSciVoc.
Keywords
Project’s keywords as indicated by the project coordinator. Not to be confused with the EuroSciVoc taxonomy (Fields of science)
Programme(s)
Multi-annual funding programmes that define the EU’s priorities for research and innovation.
HORIZON.1.1 - European Research Council (ERC)
MAIN PROGRAMME
Topic(s)
Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.
Funding Scheme
Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.
HORIZON-ERC - HORIZON ERC Grants
Call for proposal
Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.
ERC-2024-ADG
Host institution
Net EU financial contribution: the sum of money that the participant receives, less the EU contribution to its linked third parties. It reflects how the EU financial contribution is distributed between the project's direct beneficiaries and other types of participants, such as third-party participants.
N1C 4DN London
United Kingdom
Total cost: the total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.