CORDIS - EU research results

Knowledge Graph based Representation, Augmentation and Exploration of Scholarly Communication

Periodic Reporting for period 3 - ScienceGraph (Knowledge Graph based Representation, Augmentation and Exploration of Scholarly Communication)

Reporting period: 2022-05-01 to 2023-10-31

Despite improved digital access to scientific publications in recent decades, the fundamental principles of scholarly communication remain unchanged and continue to be largely document-based. The document-oriented workflows in science have reached the limits of adequacy, as highlighted by recent discussions on the increasing proliferation of scientific literature, the deficiencies of peer review and the reproducibility crisis. In ScienceGRAPH we aim to develop a novel principled model for representing, analysing, augmenting and exploiting scholarly communication in a knowledge-based way by expressing and linking scientific contributions and related artefacts through semantically rich, interlinked knowledge graphs. The model is based on a deep semantic representation of scientific contributions, their manual, crowd-sourced and automatic augmentation and, finally, intuitive exploration and interaction employing question answering on the resulting ScienceGRAPH base. Currently, knowledge graphs are still confined to representing encyclopaedic, factual information. ScienceGRAPH advances the state of the art by enabling the representation of complex interdisciplinary scientific information, including fine-grained provenance preservation, discourse capture, evolution tracing and concept drift. We will also demonstrate that automated extraction and augmentation techniques can be combined synergistically with large-scale collaboration to reach an unprecedented level of knowledge graph breadth and depth. As a result, we expect a paradigm shift in the methods of academic discourse towards knowledge-based information flows, which facilitate completely new ways of search and exploration.
The efficiency and effectiveness of scholarly communication will increase significantly, since ambiguities are reduced, reproducibility is facilitated, redundancy is avoided, provenance and contributions can be better traced and the interconnections of research contributions are made more explicit and transparent.
Already in the first reporting period, a substantial number of publications were produced and published at renowned venues. Furthermore, a publicly available demonstrator for the research has been implemented.
WP1 Deep Scholarly Knowledge Representation: We developed a conceptual model for representing scholarly contributions in knowledge graphs, which was published at the renowned JCDL 2020 conference. This work concluded MS1.2 Scholarly knowledge representation model and is currently being implemented in MS1.3 Scalable graph storage.
WP2 Scholarly Knowledge Extraction, Graph Completion & Recommendation: We have successfully driven forward work towards MS2.1 Knowledge graph-based information extraction by leveraging natural language processing methods for knowledge extraction from scientific literature, resulting in a number of conference publications at LREC, ICADL and ECIR.
WP3 Knowledge-graph-based Communication & Interaction: We have devised and implemented work regarding MS3.1 Adaptive curation methods and published the results at JCDL (on the creation and curation of state-of-the-art comparisons) and at ICADL (on importing and curating tables from survey articles).
WP4 Knowledge Graph Exploration & Question Answering has not yet formally started, but some preliminary work on simple question answering over state-of-the-art comparisons was carried out and published at TPDL.
WP5 ScienceGRAPH Testbed: We have started demonstrating the semantic organization of scholarly contributions leveraging knowledge graphs in the Open Research Knowledge Graph, which we develop together with project partner TIB and which is publicly available at https://orkg.org. In particular, we have created state-of-the-art comparisons for more than 1,000 research problems (including problems related to the COVID-19 pandemic), devised the concept of domain-specific observatories, and published first results in this regard.
The transfer of knowledge has not changed fundamentally for many hundreds of years: it is usually document-based, formerly printed on paper as a classic essay and nowadays distributed as a PDF. With around 2.5 million new research contributions every year, researchers drown in a flood of pseudo-digitized PDF publications. As a result, research is seriously weakened. In ScienceGRAPH, we argue for representing scholarly contributions in a structured and semantic way as a knowledge graph. The advantage is that information represented in a knowledge graph is readable by both machines and humans. As an example, we give an overview of the Open Research Knowledge Graph (ORKG), a service demonstrating and implementing this approach. To create the knowledge graph representation, we rely on a mixture of manual (crowd/expert sourcing) and (semi-)automated techniques. Only with such a combination of human and machine intelligence can we achieve the quality of representation required to enable novel exploration and assistance services for researchers. As a result, a scholarly knowledge graph such as the ORKG can be used to give a condensed overview of the state of the art addressing a particular research question, for example as a tabular comparison of contributions according to various characteristics of the approaches. Further possible intuitive access interfaces to such scholarly knowledge graphs include domain-specific (chart) visualizations and the answering of natural language questions.
Comparison of studies on the basic reproduction rate of COVID-19 in the Open Research Knowledge Graph
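The idea of deriving a tabular comparison from a knowledge graph can be illustrated with a minimal sketch. It assumes that contributions are stored as simple (subject, predicate, object) triples and that each property has a single value per contribution. The identifiers, property names and values below are hypothetical placeholders, and the functions `compare` and `render` are illustrative; they are not the ORKG data model or API.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object).
# All identifiers and values are illustrative placeholders, not real data.
TRIPLES = [
    ("contribution:A", "addresses", "problem:P1"),
    ("contribution:A", "method", "method-x"),
    ("contribution:A", "estimate", "1.0"),
    ("contribution:B", "addresses", "problem:P1"),
    ("contribution:B", "method", "method-y"),
    ("contribution:C", "addresses", "problem:P2"),
    ("contribution:C", "method", "method-x"),
]


def compare(triples, problem):
    """Return (properties, rows) for all contributions addressing `problem`.

    Assumes one value per (contribution, property) pair; a later triple
    with the same pair would overwrite an earlier one.
    """
    by_subject = defaultdict(dict)
    for subject, predicate, obj in triples:
        by_subject[subject][predicate] = obj

    # Keep only contributions linked to the requested research problem.
    selected = {
        s: props for s, props in by_subject.items()
        if props.get("addresses") == problem
    }

    # Columns are the union of all properties used by the selected contributions.
    properties = sorted(
        {p for props in selected.values() for p in props if p != "addresses"}
    )

    # Missing values are shown as "-" so every row has the same width.
    rows = {
        s: [props.get(p, "-") for p in properties]
        for s, props in sorted(selected.items())
    }
    return properties, rows


def render(properties, rows):
    """Render the comparison as tab-separated text."""
    lines = ["\t".join(["contribution"] + properties)]
    for subject, values in rows.items():
        lines.append("\t".join([subject] + values))
    return "\n".join(lines)


if __name__ == "__main__":
    props, rows = compare(TRIPLES, "problem:P1")
    print(render(props, rows))
```

Because the columns are computed from whatever properties the selected contributions actually use, adding a new property to one contribution extends the table without any schema change, which is one reason a graph representation suits this kind of comparison.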