Periodic Reporting for period 2 - FAIRiCUBE (F.A.I.R. information cube)
Reporting period: 2024-01-01 to 2025-09-30
To reach the project's objective, we are creating the FAIRiCUBE HUB, a crosscutting platform and framework for data ingestion, provision, analysis, processing, and dissemination. The HUB aims to unlock the potential of environmental, biodiversity, and climate data through dedicated European data spaces. Within this project, TRL 7 will be attained, together with governance measures to assure continued maintenance of the FAIRiCUBE HUB beyond the project lifespan.
This project aims to make Machine Learning (ML) on multi-thematic datacubes usable by a broader range of governance and research institutions that currently cannot easily access or utilize these resources. Selected use cases demonstrate how data-driven projects benefit from cube formats, infrastructure, and computational capabilities. These use cases guide the creation of a user-friendly FAIRiCUBE HUB, tightly integrated with European data spaces, offering stakeholders an overview of available data and processing modules. Tools will be implemented to help users who are not deeply familiar with Earth observation (EO) and ML to scope the requirements and costs of analyses, easing uptake by a broader community. FAIR sharing of results is supported through easy-to-use tools and workflows within the FAIRiCUBE HUB.
FAIRiCUBE Hub
We have established the architecture of the FAIRiCUBE Hub, which serves as the project’s cornerstone. The Hub is a crosscutting platform for data ingestion, provision, analysis, processing, and dissemination, enabling the use of environmental, biodiversity, and climate data through European data spaces. Progress has been made in defining its structure and functionality, including identifying core services under the FAIRiCUBE umbrella. Initial steps have been taken toward harmonizing and integrating these services to ensure usability and accessibility.
Holistic Metadata Management Approach
A key objective of FAIRiCUBE has been to streamline the collection, ingestion, alignment, and availability of metadata alongside Earth observation and socio-economic data, addressing a core F.A.I.R. principle. User-friendly routines for data and metadata ingestion have been developed, and efforts extended to include analysis and processing metadata. A semi-automatic monitoring system for computing resources ensures optimal performance. The FAIRiCUBE Knowledge Base (KB) provides unified access to metadata, supporting accessibility and data-driven insights.
F.A.I.R. and Open Documentation of Data Science Work
The FAIRiCUBE Hub serves as the main infrastructure for executing and documenting the data science work of the use cases (UCs), acting both as a demonstrator and a source of project execution insights. Through the UCs, challenges such as data availability, ingestion bottlenecks, resource requirements, and service interoperability have been identified and addressed, strengthening platform robustness. Technical and scientific UC work is published through dedicated GitHub repositories and FAIRiCUBE communication tools. For machine learning applications, transparent documentation of processes and results represents a significant step forward for open science.
Enabling Data Science Resources for Non-Specialists
FAIRiCUBE is committed to making data science resources accessible beyond specialist communities. This includes guidance for integrating data science into workflows and access to essential resources such as data catalogues, storage, compute, and ML tools. Documentation of UC work supports future projects on the FAIRiCUBE Hub. Some aspects of data science workflows still require technical expertise; while not all subtasks of creating ML-ready data cubes can be automated, documentation and guidelines will assist future users.
The metadata ingestion pipeline developed in the project demonstrates clear benefits for a wider data science community. Jointly developed by data ingestion and metadata specialists, it defines a baseline set of metadata fields that can serve as a reference standard for other ingestion pipelines.
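To illustrate how a baseline set of metadata fields could act as a reference for other ingestion pipelines, the following is a minimal Python sketch of a record check against such a baseline. The report does not list the actual FAIRiCUBE fields, so the field names in `BASELINE_FIELDS` and the helper `validate_record` are hypothetical and serve only as an example.

```python
from dataclasses import dataclass, field

# Hypothetical baseline fields for illustration only; the actual
# FAIRiCUBE baseline is not specified in this report.
BASELINE_FIELDS = (
    "title",
    "description",
    "license",
    "spatial_extent",
    "temporal_extent",
)


@dataclass
class ValidationResult:
    """Outcome of checking one metadata record against the baseline."""
    missing: list = field(default_factory=list)  # fields absent from the record
    empty: list = field(default_factory=list)    # fields present but without a value

    @property
    def ok(self):
        return not self.missing and not self.empty


def validate_record(record, required=BASELINE_FIELDS):
    """Report which baseline fields are missing or empty in a metadata record."""
    result = ValidationResult()
    for name in required:
        if name not in record:
            result.missing.append(name)
        elif record[name] in (None, "", [], {}):
            result.empty.append(name)
    return result


# Example record with one empty and one missing baseline field.
example = {
    "title": "Example land cover cube",
    "description": "",
    "license": "CC-BY-4.0",
    "spatial_extent": [4.0, 50.0, 7.0, 53.0],
}
report = validate_record(example)
print(report.ok, report.missing, report.empty)
```

A check of this kind could run before ingestion so that incomplete records are flagged early, while the list of required fields stays a single shared definition that other pipelines can reuse.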
FAIRiCUBE has initiated the integration of AI ethics assessment into machine learning applications. Through walkthroughs with UC owners and partners, the project has laid the groundwork for embedding socio-technical scenarios into its methodologies, ensuring ethical deployment and validation of AI technologies used in FAIRiCUBE.
Furthermore, FAIRiCUBE and its partner Constructor University (CU) have contributed to ISO and OGC standardization processes. In addition, the rasdaman subcontractor team under CU participated in OGC Testbed-19 activities related to GeoDataCube and Analysis-Ready Data, building on FAIRiCUBE experience.