Periodic Reporting for period 2 - KnowGraphs (Knowledge Graphs at Scale)
Reporting period: 2021-10-01 to 2024-03-31
The goal of the innovative training network (ITN) KnowGraphs was to address some of the key challenges related to the representation, extraction, operation and exploitation of knowledge graphs (KGs). To achieve this goal, the project developed time-efficient and effective representation, extraction, storage, verification and exploitation algorithms for KGs that can be easily employed by large and small companies as well as individuals. The legal implications of these developments and practical ways to exploit these solutions were also considered. The societal ramifications of the results of this ITN are directly linked with current developments at the interface between data, algorithms and humans at both EU and worldwide level. By making KGs easier to use in practice, the project’s outcomes support the democratization and broadening of their use. Furthermore, by studying the legal consequences of the use of KGs in real-life applications, KnowGraphs’ results support the AI and Data Protection agendas of the EU, especially with respect to explainability, consent and explicit information pertaining to the use of artificial intelligence.
Work on extraction also produced a number of high-impact results. In addition to providing new datasets for relation extraction, we developed novel algorithms for the generation of KGs from textual data. Here, both pure relation extraction and end-to-end knowledge extraction were considered. In parallel, we successfully devised a universal approach for representing language semantics, which has already found its way into industry. The bridge between natural language and KGs was further extended by our novel embedding approaches as well as our innovative fact-checking algorithms. The former exploit non-Euclidean geometries to better model the underlying semantics, while the latter exploit verbal representations of knowledge graphs to fetch information relevant to knowledge graph veracity from external sources.
The extraction results were the foundation for our operations on KGs, which considered the use of knowledge graphs as a source of background knowledge for machine learning. We developed neuro-symbolic approaches for learning on knowledge graphs that outperformed the previous state of the art by orders of magnitude with respect to runtime without sacrificing output quality. As required by European law, we also studied the legal implications of this form of learning and of operations on KGs in general.
LUH: The project contributed to the goal of "education for all" by developing a KG-based framework that makes it possible to improve the personalisation of learning approaches in smart learning environments.
WU: The ESRs at WU leveraged the formal semantics of RDF graphs and SPARQL to define a semantic usage control policy language that facilitates automated compliance checking. In addition, they extended the state of the art with respect to how and under what conditions Knowledge Graphs can support ideation tasks in the innovation process for new product and service development.
FORTH: Our main contributions pertain to NPCS, a Native Provenance Computation approach for SPARQL queries. NPCS annotates query results with their how-provenance and builds upon provenance semirings. It achieves significant runtime improvements over existing query rewriting algorithms on billion-scale RDF graphs.
UMAAS: We worked on enhancing KG quality by focusing on rule learning and rule mining for error detection. We also developed the first methods to automatically generate metadata for Knowledge Graphs using retrieval-augmented and large language models.
BABELSCAPE: Thanks to the outcomes of this project, we made significant advances towards providing valuable and trustworthy information from the huge amount of data produced every day, driving research in Knowledge Graph-related domains such as Entity Linking, Relation Extraction, and Semantic Parsing beyond the current state of the art.
RUG: RUG researchers studied how to embed data-protection-by-design requirements in KGs, in particular the requirements surrounding consent and its withdrawal when personal data is processed in KGs. The findings of this work are indispensable for the exploitation of any KG that processes personal data in the EU. They provide practical guidance for developers, which was missing until now.
USTUTT: We went beyond the state of the art by providing a novel geometric approach to embedding hyper-relational KGs and KGs with nested facts. These advances could have a substantial societal impact by changing the way knowledge is structured and utilized in various AI-driven industries. In addition, our institution conducted a deep investigation into complex query answering performance that highlights the limitations of existing approaches and paves the way for advancing the state of the art by establishing more fine-grained evaluations.
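
As an illustration of the how-provenance mentioned in the FORTH entry, the following is a minimal Python sketch of provenance-semiring annotations on a toy graph. It is not NPCS and does not reflect its implementation: the triple identifiers (t1 to t4), the helper names (var, plus, times, colleagues_of_friends, show) and the naive nested-loop join are all assumptions made for this example. The idea it shows is the standard one: each input triple carries a variable, joins multiply annotations, and alternative derivations of the same answer are added, so every result carries a polynomial describing how it was derived.

```python
from collections import Counter
from itertools import product

# A provenance polynomial maps monomials to natural-number coefficients.
# A monomial is a sorted tuple of triple identifiers (repeated for powers).

def var(name):
    """Polynomial consisting of a single variable (hypothetical helper)."""
    return Counter({(name,): 1})

def plus(p, q):
    """Semiring addition: alternative derivations of the same result."""
    return p + q  # Counter addition sums the coefficients

def times(p, q):
    """Semiring multiplication: joint use of inputs in one derivation."""
    out = Counter()
    for (m1, c1), (m2, c2) in product(p.items(), q.items()):
        out[tuple(sorted(m1 + m2))] += c1 * c2
    return out

# Toy RDF-like graph; every triple is annotated with its own variable.
triples = {
    ("alice", "knows", "bob"): var("t1"),
    ("alice", "knows", "carol"): var("t2"),
    ("bob", "worksAt", "acme"): var("t3"),
    ("carol", "worksAt", "acme"): var("t4"),
}

def colleagues_of_friends(graph):
    """Evaluate the pattern { ?x knows ?y . ?y worksAt ?z }, projecting (?x, ?z).

    Naive nested-loop join, for illustration only.
    """
    results = {}
    for (s1, p1, o1), a1 in graph.items():
        if p1 != "knows":
            continue
        for (s2, p2, o2), a2 in graph.items():
            if p2 == "worksAt" and s2 == o1:
                key = (s1, o2)
                # Join multiplies annotations; projection adds alternatives.
                results[key] = plus(results.get(key, Counter()), times(a1, a2))
    return results

def show(p):
    """Render a polynomial as text, e.g. 't1*t3 + t2*t4'."""
    return " + ".join(
        ("" if c == 1 else f"{c}*") + "*".join(m) for m, c in sorted(p.items())
    )

if __name__ == "__main__":
    for answer, poly in colleagues_of_friends(triples).items():
        print(answer, "<-", show(poly))
```

In this sketch the answer (alice, acme) is annotated with t1*t3 + t2*t4: it can be derived either from triples t1 and t3 together or from t2 and t4 together. Deleting t1 alone therefore leaves the answer derivable through the second term.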