Knowledge Graphs at Scale

Project Information

KnowGraphs

Grant agreement ID: 860801

DOI

10.3030/860801

Project closed

EC signature date 9 August 2019

Start date 1 October 2019

End date 31 March 2024

Funded under

EXCELLENT SCIENCE - Marie Skłodowska-Curie Actions

Total cost

€ 3 873 641,40

EU contribution

€ 3 873 641,40

Coordinated by

UNIVERSITAET PADERBORN
Germany

Periodic Reporting for period 2 - KnowGraphs (Knowledge Graphs at Scale)

Reporting period: 2021-10-01 to 2024-03-31

Knowledge graphs (KGs) are widely regarded as a key enabler for explainable machine learning, and they reach over 4B distinct users through Google alone. They are also used by a number of Fortune 500 companies to provide key user-facing and backend functionality (e.g. chatbots, product descriptions and recommendations). However, deploying and using KGs at the core of small and medium-sized businesses, or even for personal purposes, is still challenging for most of these organisations and individuals.
The goal of the innovative training network (ITN) KnowGraphs was to address some of the key challenges related to the representation, extraction, operation and exploitation of KGs. To achieve this goal, the project developed time-efficient and effective representation, extraction, storage, verification and exploitation algorithms for KGs that can be easily employed by large and small companies as well as individuals. The legal implications of these developments and practical ways to exploit these solutions were also considered. The societal ramifications of the results of this ITN are directly linked with current developments at the interface between data, algorithms and humans at both EU and worldwide level. By making KGs easier to use in practice, the project’s outcomes support the democratization and broadening of their use. Furthermore, by studying the legal consequences of the use of KGs in real-life applications, KnowGraphs’ results support the AI and Data Protection agendas of the EU, especially with respect to explainability, consent and explicit information pertaining to the use of artificial intelligence.

The project resulted in a number of ground-breaking successes. In the area of representation, we were able to show that representing knowledge graphs (KGs) as tensors is a viable path towards time-efficient storage and querying solutions that easily scale up to industry-relevant data volumes. In particular, we also showed that this representation makes it possible to accelerate explainable machine learning on KGs by replacing reasoners, the most important bottleneck in this family of computations. These developments are fully compatible with the graph-based policy framework for usage control and the native provenance management approaches also developed in the project.
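To illustrate the general idea of treating a KG as a tensor, the following is a minimal sketch, not the project's actual data structure. It assumes a toy graph (all entity and relation names are hypothetical) and stores the graph as a sparse order-3 boolean tensor in coordinate form, where an entry (s, p, o) is non-zero exactly when the triple is in the graph. Triple patterns then become tensor slices, and a join over a shared variable becomes a contraction over the shared index.

```python
# Hypothetical toy graph; names are illustrative only.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "upb"),
]

# Map entities and relations to integer indices along the tensor axes.
entities = sorted({t[0] for t in triples} | {t[2] for t in triples})
relations = sorted({t[1] for t in triples})
e_idx = {e: i for i, e in enumerate(entities)}
r_idx = {r: i for i, r in enumerate(relations)}

# Sparse order-3 boolean tensor in coordinate form: only the non-zero
# entries (s, p, o) are stored, as integer index triples.
nonzero = {(e_idx[s], r_idx[p], e_idx[o]) for s, p, o in triples}


def slice_tensor(s=None, p=None, o=None):
    """Return the non-zero coordinates matching a partially fixed index (None = free axis)."""
    return sorted(
        (i, j, k)
        for (i, j, k) in nonzero
        if (s is None or i == s) and (p is None or j == p) and (o is None or k == o)
    )


def objects_of(subject, predicate):
    """Answer the triple pattern (subject, predicate, ?o) as a tensor slice."""
    coords = slice_tensor(s=e_idx[subject], p=r_idx[predicate])
    return [entities[k] for (_, _, k) in coords]


def two_hop(predicate):
    """Answer (?x, p, ?y), (?y, p, ?z) by contracting over the shared index ?y."""
    p = r_idx[predicate]
    pairs = set()
    for (i, _, j) in slice_tensor(p=p):
        for (_, _, k) in slice_tensor(s=j, p=p):
            pairs.add((entities[i], entities[k]))
    return sorted(pairs)


print(objects_of("alice", "knows"))
print(two_hop("knows"))
```

The linear scan in `slice_tensor` keeps the sketch short; a storage engine aiming at industry-relevant data volumes would need an indexed structure instead.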

The area of extraction also produced a number of high-impact results. In addition to providing new datasets for relation extraction, we developed novel algorithms for the generation of KGs from textual data. Here, both pure relation extraction and end-to-end knowledge extraction were considered. In parallel, we successfully devised a universal approach for representing language semantics, which has already found its way into industry. The bridge between natural language and KGs was further extended by our novel embedding approaches as well as our innovative fact checking algorithms. The former exploit non-Euclidean geometries to achieve a better modelling of underlying semantics, while the latter exploit verbal representations of knowledge graphs to fetch information relevant to knowledge graph veracity from external sources.

The extraction results were the foundation for our operations on KGs, which considered the use of knowledge graphs as a source of background knowledge for machine learning. We developed neuro-symbolic approaches for learning on knowledge graphs that outperformed the previous state of the art by orders of magnitude with respect to runtime without sacrificing output quality. As required in European law, we also studied the legal implications of this form of learning and of operations on KGs in general.

UPB: Our contributions resulted in major improvements of the data quality of knowledge graphs, which can now be made significantly better suited for critical applications. Our data storage solution, based on a novel data structure and a corresponding treatment of (complex) queries, achieves up to 1000x faster runtimes without any need for fine-tuning. Our work on explainable machine learning led to neural class expression synthesis, a new family of machine learning techniques for knowledge graphs which are also 1000x more efficient than previous state-of-the-art techniques while remaining explainable.

LUH: The project contributed to the goal of "education for all" by developing a KG-based framework that allows improving the personalisation of learning approaches in smart learning environments.

WU: The ESRs at WU leveraged the formal semantics of RDF graphs and SPARQL to define a semantic usage control policy language that facilitates automated compliance checking. In addition, they extended the state of the art with respect to how and under what conditions Knowledge Graphs can support ideation tasks in the innovation process in new product and service development.

FORTH: Our main contributions pertain to NPCS, a Native Provenance Computation approach for SPARQL queries. NPCS annotates query results with their how-provenance and builds upon provenance semirings. It achieves a significant runtime improvement over existing query rewriting algorithms on billion-triple RDF graphs.
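To make the notion of how-provenance concrete, the following is a minimal sketch of the general provenance-semiring idea, not of NPCS itself. It assumes a toy set of triples, each tagged with a hypothetical token (t1, t2, ...), and a hard-coded two-pattern join. Each answer is annotated with a polynomial over the tokens: multiplication records triples used together in one derivation, and addition records alternative derivations.

```python
from collections import Counter

# Hypothetical annotated triples: each one carries a provenance token.
annotated = {
    ("alice", "knows", "bob"): "t1",
    ("alice", "knows", "dave"): "t2",
    ("bob", "worksAt", "upb"): "t3",
    ("dave", "worksAt", "upb"): "t4",
}

# A polynomial is a Counter mapping a monomial (sorted tuple of tokens)
# to its integer coefficient.


def token(t):
    """Polynomial consisting of a single token."""
    return Counter({(t,): 1})


def plus(a, b):
    """Semiring addition: alternative derivations of the same answer."""
    return a + b


def times(a, b):
    """Semiring multiplication: inputs used jointly in one derivation."""
    out = Counter()
    for m1, c1 in a.items():
        for m2, c2 in b.items():
            out[tuple(sorted(m1 + m2))] += c1 * c2
    return out


def employers_of_acquaintances(person):
    """Evaluate (person, knows, ?y), (?y, worksAt, ?org) and project on ?org."""
    results = {}
    for (s1, p1, o1), tok1 in annotated.items():
        if s1 != person or p1 != "knows":
            continue
        for (s2, p2, o2), tok2 in annotated.items():
            if s2 == o1 and p2 == "worksAt":
                derivation = times(token(tok1), token(tok2))
                results[o2] = plus(results.get(o2, Counter()), derivation)
    return results


def show(poly):
    """Render a polynomial as text, e.g. 't1*t3 + t2*t4'."""
    return " + ".join(
        ("" if c == 1 else f"{c}*") + "*".join(m) for m, c in sorted(poly.items())
    )


for org, poly in employers_of_acquaintances("alice").items():
    print(org, "<-", show(poly))
```

In this toy example the answer `upb` is derived in two independent ways, so its annotation is a sum of two products. Reading the polynomial shows which input triples would have to be removed for the answer to disappear.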

UMAAS: We worked on enhancing KG quality by focusing on rule learning and rule mining for error detection. We also developed the first methods to automatically generate metadata for Knowledge Graphs using retrieval-augmented and large language models.

BABELSCAPE: Thanks to the outcomes of this project, we made significant advances towards providing valuable and trustworthy information from the huge amount of data produced every day, by driving research in Knowledge Graph-related domains such as Entity Linking, Relation Extraction and Semantic Parsing beyond the current state of the art.

RUG: RUG researchers studied how to embed data protection by design requirements in KGs and in particular the requirements surrounding consent and its withdrawal when processing personal data in KGs. The findings of this work are indispensable for the exploitation of any KGs processing personal data in the EU. They provide practical guidance for developers, which was missing until now.

USTUTT: We went beyond the state of the art by providing a novel geometric approach to embedding hyper-relational KGs and KGs with nested facts. The societal impact of these advancements is profound, potentially revolutionizing the way knowledge is structured and utilized in various AI-driven industries. In addition, our institution conducted a deep investigation into complex query answering performance that highlights the limitations of existing approaches and paves the way for advancing the state of the art by establishing more fine-grained evaluations.
