CORDIS - EU research results

Tracing knowledge graph provenance from textual knowledge sources

Project description

Solving challenges in knowledge graph provenance

In the digital era, knowledge graphs (KGs) organise vast amounts of information and underpin applications such as disease diagnosis and drug discovery. However, KG knowledge is primarily sourced from text, which makes its reliability hard to establish. Verifying the origin of this knowledge in its textual sources, known as provenance, is essential but difficult. With the support of the Marie Skłodowska-Curie Actions programme, the KG-PROVENANCE project aims to develop efficient models to detect and validate the origins of KG knowledge. It will tackle scalability through subsampling methods and develop a dynamic architecture that aligns knowledge shifts in text with KG updates.

Objective

Knowledge Graphs (KGs) play a vital role in modern computer systems by organizing information efficiently through structured relations between concepts or entities. They provide a structured framework for storing and retrieving information, which makes large volumes of data easier to navigate and analyse. This is crucial in interdisciplinary, knowledge-intensive applications such as disease diagnosis, drug discovery, ecological data interpretation, and specialized search engines. The knowledge in KGs is predominantly derived from unstructured textual sources, such as scientific articles and news feeds. However, verifying the origin of KG knowledge in these textual sources, known as the provenance of KG knowledge, is currently challenging. Provenance detection is essential for explaining and validating the knowledge stored in KGs and for identifying potential inconsistencies with textual sources. To address the lack of efficient KG provenance detection models, I will tackle two major scientific challenges. Firstly, processing a large volume of text as a source of information requires significant computational power, which poses a scalability problem. To overcome this, I will design subsampling methods that focus only on the most relevant textual passages representing the knowledge in a KG. Secondly, the scalability problem is further complicated by the dynamic and evolving nature of knowledge, with millions of new textual sources appearing daily. This makes it difficult to efficiently identify the textual sources that contribute to knowledge shifts and to use them as provenance to define KG updates. To address this, I will develop a novel scalable architecture that efficiently aligns knowledge shifts in text with concrete changes in KGs. Finally, I will collaborate closely with interdisciplinary industrial researchers to demonstrate the effectiveness of the developed methodology in real-world scenarios.
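
To make the idea of passage subsampling concrete, the following is a minimal illustrative sketch and not the project's method, which is not specified here. It assumes a KG fact is a (subject, relation, object) triple and ranks candidate passages by simple word overlap with that triple, keeping only the top-scoring ones as provenance candidates. The function names, the overlap score, and the example data are all hypothetical.

```python
import re
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # hypothetical (subject, relation, object) form


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(triple: Triple, passage: str) -> float:
    """Fraction of the triple's tokens that also occur in the passage."""
    triple_terms = tokenize(" ".join(triple))
    if not triple_terms:
        return 0.0
    return len(triple_terms & tokenize(passage)) / len(triple_terms)


def subsample_passages(triple: Triple, passages: List[str], k: int = 2) -> List[str]:
    """Keep at most k passages with non-zero overlap, best first."""
    ranked = sorted(passages, key=lambda p: overlap_score(triple, p), reverse=True)
    return [p for p in ranked[:k] if overlap_score(triple, p) > 0]


# Made-up example data, for illustration only.
triple = ("aspirin", "treats", "headache")
passages = [
    "Aspirin is commonly used and treats mild headache.",
    "The stock market fell sharply on Monday.",
    "Headache is a frequent symptom of dehydration.",
]
candidates = subsample_passages(triple, passages, k=2)
```

A lexical overlap score is used here only because it is cheap and easy to read; a realistic system would more likely rely on learned retrieval or entity linking, and the choice of scoring function is left open by the text above.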

Keywords

Project’s keywords as indicated by the project coordinator. Not to be confused with the EuroSciVoc taxonomy (Fields of science)

Programme(s)

Multi-annual funding programmes that define the EU’s priorities for research and innovation.

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Funding Scheme

Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.

HORIZON-TMA-MSCA-PF-EF - HORIZON TMA MSCA Postdoctoral Fellowships - European Fellowships

Call for proposal

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

HORIZON-MSCA-2023-PF-01

Coordinator

AARHUS UNIVERSITET
Net EU contribution

Net EU financial contribution. The sum of money that the participant receives, deducted by the EU contribution to its linked third party. It considers the distribution of the EU financial contribution between direct beneficiaries of the project and other types of participants, like third-party participants.

€ 230 774,40
Address
NORDRE RINGGADE 1
8000 Aarhus C
Denmark

Region
Danmark Midtjylland Østjylland
Activity type
Higher or Secondary Education Establishments
Total cost

The total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.

No data

Partners (1)