The key developments within the scope of the QuAre project are as follows:
a) Two novel ontologies (data models) were developed: one for system health monitoring in general, and one for PEM fuel cell system monitoring (PEMFCv2.0). PEMFCv2.0 was based on the literature on PEMFC diagnostics. The ontology was then populated with raw and processed data in a streaming manner, forming the dynamic spatio-temporal knowledge graph PEMFC KG. The development of the ontologies followed the linked data principles.
b) A reasoning-based pipeline for fuel cell diagnostics was developed that takes into account potential sensor unreliability and can also identify abnormal sensor behaviour. The pipeline was tested and evaluated with real sensor data.
c) The ability of transformer-based models to perform diagnostics was also tested, with very encouraging results. Specifically, three pre-trained models (DeBERTa, GPT3.5, GPT4) were fine-tuned on a synthetic dataset generated from ontological knowledge bases, on which they performed very well. They were then trained with few-shot learning on fuel cell data and rules expressed in natural language, again with very promising results.
d) Embedding-based techniques were explored for efficient query answering over dynamic knowledge graphs. Initially, we focused on spatial data, for which we developed a novel method. However, given 1) the good results described in c) and 2) the limited expressivity of the queries that embedding-based techniques can support, we did not explore this approach any further.
e) We developed and implemented a methodology for generating questions and queries in natural language. Additionally, two approaches for translating questions to queries were developed: one template-based and one based on large language models.
f) The pipeline was further extended to return the answers in natural language, leveraging existing tools from the literature.
g) Finally, a novel framework for early fuel cell system diagnostics was developed using the aforementioned components, which were tested with real fuel cell data and real questions provided by domain experts.
The project results have already been disseminated via presentations at international conferences, technical workshops and peer-reviewed publications. The data and code generated by all publications are openly available. The Fellow has also presented her work in three invited talks, disseminating it in both research and industry. Additionally, she has carried out two outreach activities in schools.
QuAre developments have been used as a baseline in the projects GeoQA (HFRI project) and AI4Copernicus (ICT49 project, funded by Horizon2020). Additionally, the wide applicability of some of the methodologies developed within the QuAre project led to additional research outputs pertinent to question answering and automatic dataset generation.