Periodic Reporting for period 1 - QuAre (Question Answering for MonitoRed Fuel Cell systEms)
Reporting period: 2022-01-01 to 2023-12-31
The significance of this project is twofold: a) a generic one, relevant to any health monitoring system, as it introduces a novel AI knowledge-based methodology aligned with trustworthiness and the FAIR (Findable, Accessible, Interoperable, Reusable) principles; b) a specific one, focused on fuel cell diagnostics. The recently proposed Strategic Technologies for Europe Platform (STEP) includes clean technologies as one of its three priority areas. In the energy sector, these clean technologies include, among others, fuel cells. Hence, their good performance is vital for the climate, the economy and, eventually, society as well.
The following Research Objectives (ROs) were identified:
● RO1: To extend the data model developed in the researcher's previous work with spatio-temporal features, so that the duration of abnormal measurements can be accurately captured and later assessed (see RO2).
● RO2: To radically extend the existing ontology to encapsulate rules related to the degradation effects of cumulative faults, long-term storage, start-stop cycling and environmental conditions, and to extend the inference engine accordingly to enable inferences on spatio-temporal data.
● RO3: To create knowledge graph (KG) embeddings of the static and streaming data, so that manipulation of the KGs is simplified, and hence significantly accelerated, while the structure of the graphs is preserved.
● RO4: To develop techniques and extend the existing system for answering complex factoid and non-factoid questions effectively (with high precision and recall) and efficiently (with very short response times).
● RO5: To employ natural language generation (NLG) techniques so that the developed platform can express the justifications of its answers in natural language, thereby passing tangible and accurate information to the end user.
● RO6: To work with the Fuel Cell system lab hosted by Aeronautical and Automotive Engineering (AAE) at Loughborough University (LU) to evaluate our tool and to conduct a user study with service engineers.
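To illustrate what "capturing the duration of abnormal measurements" (RO1) can mean in practice, the following is a minimal Python sketch, not the project's data model. It assumes timestamped sensor readings and a fixed normal range; the names `Reading` and `abnormal_intervals`, and the temperature thresholds in the usage note, are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class Reading:
    """A single timestamped sensor value (hypothetical structure)."""
    timestamp: datetime
    value: float


def abnormal_intervals(
    readings: List[Reading], low: float, high: float
) -> List[Tuple[datetime, datetime]]:
    """Return (start, end) pairs for maximal runs of out-of-range readings.

    The end of an interval is the timestamp of the last abnormal reading
    in the run, so a single isolated abnormal reading has zero duration.
    """
    intervals: List[Tuple[datetime, datetime]] = []
    start: Optional[datetime] = None
    last: Optional[datetime] = None
    for r in sorted(readings, key=lambda x: x.timestamp):
        if not (low <= r.value <= high):
            if start is None:
                start = r.timestamp  # a new abnormal run begins
            last = r.timestamp
        elif start is not None:
            intervals.append((start, last))  # the run ended at the previous reading
            start = None
    if start is not None:
        intervals.append((start, last))  # the run was still open at the end of the data
    return intervals
```

For example, with one reading per second and an assumed normal range of 60 to 80, the values 70, 85, 86, 70, 90, 70 yield two abnormal intervals, lasting one second and zero seconds respectively. Interval durations of this kind are what a rule about cumulative faults would need as input.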
a) Two novel ontologies (data models) were developed: one for system health monitoring in general, and one for PEM fuel cell system monitoring (PEMFCv2.0). The development of PEMFCv2.0 was based on the literature on PEMFC diagnostics. The PEMFCv2.0 ontology was then populated with raw and processed data in a streaming manner, forming the dynamic spatio-temporal knowledge graph PEMFC KG. The development of the ontologies followed the linked data principles.
b) A reasoning-based pipeline for fuel cell diagnostics was developed that takes potential sensor unreliability into account. Additionally, it can identify abnormal behaviour of the sensors. The pipeline was tested and evaluated with real sensor data.
c) The ability of transformer-based models to perform diagnostics was also tested, with very encouraging results. Specifically, three different pre-trained models (DeBERTa, GPT-3.5 and GPT-4) were fine-tuned on a synthetic dataset generated from ontological knowledge bases, on which they performed very well. They were then adapted, using few-shot learning, to fuel cell data and rules expressed in natural language. The results were very promising.
d) Embedding-based techniques were explored for efficient query answering over dynamic knowledge graphs. Initially, we focused on spatial data, for which we developed a novel method. However, given 1) the good results described in c), and 2) the limited expressivity of the queries that embedding-based techniques can support, we did not pursue this approach any further.
e) We developed and implemented a methodology for generating questions and queries in natural language. Additionally, two different approaches for translating questions to queries were developed: one template-based and one based on large language models.
f) The pipeline was further extended to return the answers in natural language. This was done by leveraging existing tools from the literature.
g) Finally, a novel framework for early fuel cell system diagnostics was developed using the aforementioned components, which were tested with real fuel cell data and real questions provided by domain experts.
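As a rough illustration of the template-based route for translating questions to queries mentioned above, the sketch below matches a question against regular-expression templates and fills the captured slots into a SPARQL pattern. It is a generic example of the technique, not the project's implementation: the `ex:` namespace, the class and property names (`ex:Fault`, `ex:detectedIn`, `ex:hasDuration`) and both templates are invented for illustration.

```python
import re
from typing import Optional

# Hypothetical namespace; the project's real vocabulary is not shown in this report.
PREFIX = "PREFIX ex: <http://example.org/fc#>\n"

# Each template pairs a question pattern with a SPARQL skeleton.
# Doubled braces are literal braces for str.format; {name} marks a slot.
TEMPLATES = [
    (
        re.compile(r"^which faults were detected in (?P<component>[\w ]+)\?$", re.I),
        "SELECT ?fault WHERE {{ ?fault a ex:Fault ; ex:detectedIn ex:{component} . }}",
    ),
    (
        re.compile(r"^how long did (?P<fault>[\w ]+) last\?$", re.I),
        "SELECT ?duration WHERE {{ ex:{fault} ex:hasDuration ?duration . }}",
    ),
]


def to_identifier(text: str) -> str:
    """Turn a phrase such as 'stack 1' into a local name such as 'Stack1'."""
    return "".join(word.capitalize() for word in text.split())


def question_to_query(question: str) -> Optional[str]:
    """Return a SPARQL query for the first matching template, or None."""
    q = question.strip()
    for pattern, skeleton in TEMPLATES:
        match = pattern.match(q)
        if match:
            slots = {k: to_identifier(v) for k, v in match.groupdict().items()}
            return PREFIX + skeleton.format(**slots)
    return None  # no template covers this question
```

Calling `question_to_query("Which faults were detected in stack 1?")` produces a query whose pattern contains `ex:detectedIn ex:Stack1`, while a question outside the templates returns `None`. This also shows the usual trade-off of the approach: templates are predictable and easy to audit, but they only cover phrasings anticipated in advance, which is one reason to complement them with a language-model-based translator.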
The project results have already been disseminated via presentations at international conferences, technical workshops and peer-reviewed publications. All publications, generated data and code are openly available. The Fellow has further presented her work in three invited talks, thereby disseminating it in both research and industry. Additionally, she has performed two outreach activities in schools.
QuAre developments have been utilized as a baseline in the projects GeoQA (HFRI project) and AI4Copernicus (ICT49 project, funded by Horizon 2020). Additionally, the wide applicability of some of the methodologies developed within the QuAre project led to additional research outputs pertinent to question answering and to automatic dataset generation.
-The diagnostic approach is 100% explainable, due to its knowledge-based nature.
-It is the first time that large language models have been employed for knowledge-based fuel cell system diagnostics. This is a significant output because it means that the end user can alter the diagnostic rules at any time by simply using natural language, eliminating the need for IT experts. Additionally, this methodology can be employed for the diagnostics of any system that is governed by diagnostic rules expressible in the standard ontological languages.
-One of the main challenges introduced by large language models is the need for large datasets. In QuAre, the Fellow developed a methodology for the automated generation of large datasets. This approach facilitated the researcher's interactions with colleagues working in the field of question answering and eventually led to additional research outputs pertinent to question answering.
-This is the first question-answering system developed for the interrogation of fuel cell systems.
Efficient and accurate fuel cell system diagnosis can lead to longer-lived fuel cell systems, thereby reducing greenhouse gas emissions and contributing to the EU's vision on Climate Action.