Periodic Reporting for period 2 - AquaINFRA (Infrastructure for Marine and Inland Water Research)
Reporting period: 2024-01-01 to 2025-12-31
The AquaINFRA Data Space, with its Data Discovery and Access Service (DDAS), provides access to a seamless pan-European hydrography dataset and facilitates true integration between the marine and freshwater sciences.
Researchers' experience of the advantages of open and FAIR principles in the AquaINFRA use cases has led to further adoption of these principles, contributing to making open and FAIR data the new norm.
The uptake of Open Science practices appears to increase when FAIR data from a wide range of data providers can be handled through the AquaINFRA Data Space and DDAS, which make Open Science services and workflows available through a user-friendly interaction platform. The next step will be integration within the new node structure of the EOSC Federation.
The expected wider scientific, economic and societal effects of the project, which contribute to the expected impacts outlined in the respective destination of the work programme, include a transformation in the way researchers share and exploit data. This transformation is achieved through fully transparent and user-friendly access to analyses and the underlying data in the AquaINFRA infrastructure and services. Researchers will then spend less time looking for data and will be able to easily publish their work in a fully reproducible manner.
The AquaINFRA Interaction Platform (AIP) assists users in finding, inspecting, and selecting relevant data as input for an analysis in the Virtual Research Environment (VRE). The idea of the VRE is to provide a web-based application where users can reuse existing tools, contribute newly developed tools, and connect them to create readily shareable and reproducible workflows.
The VRE is based on Galaxy, a freely available online platform that allows users to analyse data, create workflows, and share such data and workflows with other Galaxy users and on Zenodo. Tutorials on how to use the system are available on the front page of the AIP, from which the Data-to-Knowledge Packages (D2K) stored on Zenodo can also be accessed.
An AquaINFRA instance has been created on the Galaxy platform. This instance has been tested and forms a key element of the AquaINFRA architecture.
The hydrography dataset, with its integrated data, introduced the seamless connectivity framework within the AquaINFRA project. The framework focuses on establishing the connectivity between streams, rivers and lakes, and on how freshwater bodies are connected to the coast and the marine realm.
The connectivity framework forms the basis for subsequent aquatic data processing and analysis services across Europe, and provides the foundation for reproducible freshwater geospatial analysis services for research communities across the aquatic research disciplines.
Datasets will be made available continuously throughout the project to address evolving needs arising from work on the use cases. In working with the socio-economic and biodiversity datasets and integrating them into the AquaINFRA data ensemble, the goal is to enable users to capitalise on all these datasets seamlessly.
The use cases have formed the basis for the co-development of workflows, which has resulted in 41 AquaINFRA processing services / Galaxy tools.
Based on user needs, Open Educational Resources (OERs) tailored to the AquaINFRA research infrastructure have been developed to enable community engagement and capacity building.
Explorations and negotiations are ongoing to find the best way to ensure the long-term sustainability of the AquaINFRA solutions.
Together with the prototypes for the AquaINFRA interaction platform, this paves the way for the further development of easier, seamless access to relevant data covering the whole ocean and freshwater continuum, including environmental as well as socio-economic data, which will take AquaINFRA beyond the state of the art.
During the second reporting period, this was followed by an MVP release and approval of the AquaINFRA Data Space (milestone 3 in month 18). The metadata and data providers connected to, indexed by, and discoverable through the AquaINFRA Data Discovery and Access Service (DDAS) have now reached >90 million metadata records (target by project end: 10):
• 96 vector and raster datasets in the AquaINFRA Data Lake
• 22 external metadata catalogues connected to DDAS
Nine use cases have been further developed within the four case study regions with a specific focus on how to bridge the freshwater and marine water domains and how to integrate socio-economic data. Based on the use cases, workflows demonstrating how to utilise the components of the AquaINFRA research environment have been developed and made available on the AquaINFRA Galaxy platform.
During the second reporting period, 41 processing services have been developed and tested in the AquaINFRA case study areas, and they are now accessible via Galaxy.
Towards the end of the project these will be further upscaled and expanded.