Periodic Reporting for period 1 - AI-DAPT (AI-Ops Framework for Automated, Intelligent and Reliable Data/AI Pipelines Lifecycle with Humans-in-the-Loop and Coupling of Hybrid Science-Guided and AI Models)
Reporting period: 2024-01-01 to 2025-02-28
In its quest for automation, AI-DAPT will design a novel AI-Ops / intelligent pipeline lifecycle framework that cuts across the different business, legal/ethics, data, AI logic/models, and system requirements while always ensuring a human-in-the-loop (HITL) approach. Taking into consideration the dual “data and model” design, development and operation perspectives that need to be brought together gracefully and effectively, AI-DAPT will establish the underlying methodological and technical foundations across different axes that work together in a complementary and interactive manner, as follows:
A. AI/Data Axis (addressing Challenges 1-7). AI-DAPT shall adopt novel automated approaches, fused with targeted human-in-the-loop aspects, to improve “data for AI” pipelines in a systematic and scalable way. Typical data management activities, including (but not limited to) data definition, preparation, annotation, cleaning, manipulation, synthetic generation and observability, will be revamped through AI-driven automation. This automation further leverages Explainable AI (XAI) techniques to ensure human interaction and informed intervention whenever required, both for taking the final decisions regarding the pipeline configuration and for ensuring the quality of the data and the ethical use of the underlying AI. Through the AI-DAPT data-centric AI research, the raw datasets will be transformed into appropriate, value-adding, up-to-date and reusable features that effectively reduce the “time to insights”.
B. AI/Model Axis (addressing Challenges 8-11). AI-DAPT shall explore and promote hybrid science-AI solutions, bringing together data-driven AI models and science-based first-principles models, that build on high-quality and reliable data. In practice, AI-DAPT introduces targeted automation interventions in the AI model building, training, validation and observability steps. These interventions shall allow continuous, dynamic AI improvements by reducing the “time to detection” and “time to resolution” for any AI pipeline problem, adjusting the training on the fly and ensuring that AI works reliably across diverse production environments and settings.
In order to demonstrate the actual innovation and added value that can be derived from the AI-DAPT scientific advancements, the AI-DAPT results will be validated along two interlinked axes: I. Through their actual application to real-life problems in four (4) representative industries that are characterized by varying degrees of AI maturity: (a) Health, (b) Robotics, (c) Energy, and (d) Manufacturing; II. Through their integration in different AI solutions, either open source (e.g. Jupyter, Acumos AI) or commercial (e.g. S5 Enterprise Analytics Suite, Qlik). The purpose of this integration is to demonstrate that the AI-DAPT results bring added value within the established AI market landscape.
STO.1: To set and implement the underlying foundations for data-centric trustworthy AI solutions through reliable and interoperable pipelines that consistently automate the end-to-end data/AI management processes and effectively fuse first-principles scientific models and data-driven AI/machine learning models.
Our main achievements are:
-Thorough understanding of end-user needs and system requirements.
-Design of the AI pipeline lifecycle.
-Establishment of an ethics-by-design approach to handle critical ethics issues in the demonstrators.
-Deployment of a dynamic research and technology radar.
-Definition of the project's research agenda.
STO.2: To design and deliver appropriate automated and XAI-based “Data for AI” techniques cross-cutting the data pipeline lifecycle operations, from data mining/harvesting, investigation, documentation, annotation, and cleaning to fit-for-purpose data valuation and synthetic data generation, to allow accruing the right value for the right data at the right time.
Our main achievements are:
-Requirements elicitation and drafting of the components' internal architecture.
-Foundation for data lifecycle services.
-Tool design and prototyping.
-Software component interoperability.
-Early integration with monitoring and workflow tools.
STO.3: To develop rigorous automated, fair and trusted “Hybrid Science-guided AI Models” techniques cross-cutting the AI pipeline lifecycle operations, from interactive training, explainability and evaluation to continuous delivery and observability to provide constantly accurate, data-driven and scientifically consistent insights in an energy-efficient manner.
Our main achievements are:
-Alignment with user needs/WP1.
-State-of-the-art methodologies assessment.
-Tool and service specifications.
-Initial architectural designs of components.
-AI Pipeline execution and automation framework design.
-Governance and ethical AI foundations.
STO.4: To serve automated, end-to-end data/AI pipelines through the novel, integrated AI-DAPT AI-Ops platform based on easily deployable and scalable services, enabling efficient, trustful, reliable & interoperable exchanges with a wealth of sources, systems and data spaces.
Our main achievements are:
-Reference architecture and integration planning.
-User-centric workflows and interactions.
-Integration planning and CI/CD pipelines.
-Design of an early software validation strategy.