Knowledge graphs are a family of data structures used in a growing number of user-facing applications. They are now widely regarded as a key enabler of explainable machine learning for the masses. However, the knowledge graphs used in real applications are often large, inconsistent and incomplete. They therefore pose four main challenges. First, they are difficult to store and query, especially when analytical queries are required to gather high-value data. Second, standard inference engines cannot be used on them, as these engines simply terminate their computation when faced with inconsistencies. Third, their incompleteness often leaves downstream applications unaware of highly relevant assertions. Finally, their non-vectorial representation makes it challenging to deploy classical machine learning techniques for analysis or exploitation.
The goal of ENEXA was to address these challenges. To achieve this goal, the project specified, implemented, and evaluated novel extraction, storage, inference, and explainable machine learning techniques for knowledge graphs at large scale. All approaches in the project were evaluated on knowledge graphs of 1 billion triples or more. Our results show that these approaches are ready to empower small and large European enterprises to use knowledge graphs reliably as a cornerstone of their data strategy.