CORDIS - EU research results

Graphs and Ontologies for Literary Evolution Models

Periodic Reporting for period 1 - GOLEM (Graphs and Ontologies for Literary Evolution Models)

Reporting period: 2023-01-01 to 2025-06-30

The goal of the project is to create accurate models of how the cultural traits of fiction, both formal and content-related, spread and combine. The data are (fan)fiction stories in five languages (English, Spanish, Italian, Korean, and Indonesian) gathered from various online platforms. The methodology mainly combines computational literary studies and cultural evolution theory, with influences from fan studies and information science. Millions of stories are shared on online platforms such as Wattpad, AO3, and Fanfiction.net, together with readers’ reactions and comments. The GOLEM project will analyze these stories and the responses to them. This analysis can provide a wealth of information about the characters in a story, its genre, its subject, its construction, and the themes it covers, as well as what readers from different countries and cultures find important in a story. Which linguistic, stylistic, and thematic elements of a story become popular in different languages? What makes a story get read, and what do readers value in it? With the help of computer models, the information collected in this research makes it possible to answer these kinds of questions. The goal is to test hypotheses about cultural evolution and to develop a methodology that can also be applied to books from other periods in history. In this way, we can study the evolution of fiction over the centuries and gain unprecedented insight into something as old as humanity itself: storytelling.
The first two years of the project focused on the creation of the knowledge graph database (WP1) and the metadata generation (WP2). More specifically:
• Task 1.1: we identified cultural traits that could be added as classes of the GOLEM ontology and defined the details of the formal ontology for narrative and fiction.
• Task 1.2: we collected data from four data sources and shared part of it in a triple store in the form of derived data.
• Task 1.3: we populated the knowledge graph with metadata coming from the original data collection and started to expand it with additional information extracted from the full text of the stories.
• Task 2.1: the research team manually annotated texts and created triples to be added to the knowledge graph.
• Task 2.2: we completed the guidelines for the annotation of narrative events and characters’ traits that will later be used in the crowdsourced annotation of texts.
• Task 2.3: we extracted triples automatically from unstructured text via machine learning and NLP techniques, focusing on narrative events and characters participating in them.
• Task 2.4: we extracted triples automatically from infoboxes on the wiki Fandom.com.
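As a rough illustration of what extracting triples from a wiki infobox (Task 2.4) can look like, the sketch below turns a dictionary of infobox fields into subject-predicate-object triples and serializes them as N-Triples lines. It is not the project's pipeline: the `http://example.org/` namespace, the `label` and `prop/` predicates, and the function names are all hypothetical placeholders rather than terms from the GOLEM ontology.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespace for illustration only; NOT the GOLEM ontology's IRIs.
EX = "http://example.org/"


def slug(text: str) -> str:
    """Lowercase a label and join its words with underscores (no full IRI escaping)."""
    return "_".join(text.strip().lower().split())


def infobox_to_triples(name: str, infobox: Dict[str, str]) -> List[Triple]:
    """Map one infobox (field -> value) to triples about a single character."""
    subject = EX + "character/" + slug(name)
    triples: List[Triple] = [(subject, EX + "label", name)]
    for key, value in infobox.items():
        value = value.strip()
        if not value:
            continue  # skip fields that are present but empty
        triples.append((subject, EX + "prop/" + slug(key), value))
    return triples


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize triples as N-Triples lines, treating every object as a plain literal."""
    lines = []
    for s, p, o in triples:
        escaped = (
            o.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        lines.append(f'<{s}> <{p}> "{escaped}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented example data, not taken from any real wiki page.
    example = {"Species": "Human", "Eye color": "", "Home World": " Earth "}
    print(to_ntriples(infobox_to_triples("Jane Doe", example)))
```

In a real setting the predicates would come from a formal ontology and many objects would be IRIs rather than literals; the sketch keeps everything as strings to stay self-contained.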

The major achievements of the project so far are:
- The GOLEM Ontology for Narrative and Fiction. Wiki: https://github.com/GOLEM-lab/golem-ontology. Ontology of fiction and narrative, developed as an extension of CIDOC-CRM and LRMoo, and aligned to DOLCE-Lite-Plus. Formal ontologies provide a structured and systematic approach to representing the essential elements of storytelling. By capturing relevant concepts, constraints, and interrelationships among narrative elements, this ontology ensures a consistent and explicit representation of the narrative domain.
- "Event Detection between Literary Studies and NLP: A Survey, a Narratological Reflection, and a Case Study". https://doi.org/10.26083/tuprints-00030150. Because there is no consensus on the definition of an event within or across domains, previous works take a wide range of approaches to automated event detection and apply it in many different ways. We give an overview of how previous works differ from each other and how our model relates to them. We also compare our model to a storyline analysis framework developed for news and show that our model is applicable to news as well.
- "The GOLEM-Knowledge Graph and Search Interface: Perspectives into Narrative and Fiction". https://ceur-ws.org/Vol-3834/paper80.pdf. A user-friendly interface and access point that allows users to browse the knowledge graph even without knowledge of SPARQL, offering different perspectives into content-related data and metadata from the domain of fanfiction narratives.
- "The GOLEM Triple Store: A Graph-Based Representation of Narrative and Fiction". https://anr-kflow.github.io/semmes/papers2024/SEMMES_2024_paper_3.pdf. This triple store is the first step towards a large-scale knowledge graph for stories, as well as characters and events in narratives. It contains more than 8 million stories collected from the Archive of Our Own (AO3), providing scholars with a tool to derive unique insights into fan narratives and storytelling trends over time.
- Ontologies for Narrative and Fiction Workshop. Full report: https://jvmg.iuk.hdm-stuttgart.de/2023/07/17/presenting-at-the-ontologies-for-narrative-and-fiction-workshop/; program and slides: https://golemlab.eu/news/ontology-workshop/. This event brought together a spectrum of expertise and experience in modelling themes, genres, narratives and characters in fiction using an array of approaches. We started exploring the potential interoperability of these models and the gaps between them, generating insight that informed the development of the GOLEM ontology.
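To give a sense of the kind of lookup that a triple store supports, and that a search interface can hide behind a form, here is a minimal sketch of matching a single triple pattern against an in-memory list of triples. It is an illustration under stated assumptions, not the GOLEM implementation: the identifiers (`story:1`, `hasLanguage`, `hasCharacter`) and the sample data are invented, and a real deployment would use a triple store queried with SPARQL rather than a Python list.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Invented sample data; identifiers and predicates are hypothetical.
GRAPH: List[Triple] = [
    ("story:1", "hasLanguage", "Italian"),
    ("story:1", "hasCharacter", "character:a"),
    ("story:2", "hasLanguage", "Korean"),
    ("story:2", "hasCharacter", "character:a"),
    ("story:3", "hasLanguage", "Italian"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that agree with every given position; None acts as a wildcard."""
    for t in triples:
        if (
            (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        ):
            yield t


def stories_in_language(triples: Iterable[Triple], language: str) -> List[str]:
    """Return the sorted subjects of all (?, hasLanguage, language) triples."""
    return sorted(subj for subj, _, _ in match(triples, p="hasLanguage", o=language))


if __name__ == "__main__":
    print(stories_in_language(GRAPH, "Italian"))
    print(list(match(GRAPH, s="story:2")))
```

Leaving a position as `None` plays the role of a variable in a query pattern, which is why one small function can answer both "which stories are in Italian?" and "what do we know about story 2?".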
The ontology for narrative and fiction advances beyond the state of the art by being deeply rooted in literary theory and narratology, which ensures a conceptually rigorous framework for modeling narrative structures. Unlike existing approaches, which are often tailored to specific domains or applications, this ontology provides a standardized yet flexible representation that aligns with established theoretical perspectives in literary studies. Its design also enables comparative analysis across languages, facilitating cross-linguistic research on narrative structures and expanding the applicability of computational methods in literary studies. This interdisciplinary and theoretically grounded approach enhances both the precision and the scope of narrative modeling, making the ontology a valuable tool for researchers in multiple fields. It also helps overcome the English-only bias of much research in NLP and cultural analytics.
Traditional quantitative and probabilistic methods in literary studies often struggle to fully capture the semantic depth and qualities of texts. These approaches, while useful for statistical analysis, lack the capacity to formally represent the intricate relationships and underlying structures that define narratives. In response to these limitations, ontology-based modeling has emerged as a powerful methodological innovation, enabling the explicit and computational representation of narrative elements. The modular architecture of the GOLEM ontology not only enhances analytical precision but also facilitates interdisciplinary applications, bridging the gap between humanities research and computational techniques, and it can also be applied to other domains such as news and historical analysis. Through this innovative approach, GOLEM lays the groundwork for a more nuanced and systematic study of narrative structures in both traditional and digital media.