Periodic Reporting for period 1 - AD4GD (All Data 4 Green Deal - An Integrated, FAIR Approach for the Common European Data Space)
Reporting period: 2022-09-01 to 2024-02-29
The project demonstrates the integration of data from remote sensing, established Virtual Research Environments and Research Infrastructures, Internet of Things (IoT), Citizen Science, and socioeconomic data in an interoperable, scalable and reliable manner in three pilot cases. AD4GD develops semantic mappings to different standards and models and, as a result, brings together domain- and data source-specific semantic concepts such as the Essential Variables framework.
AD4GD has worked on ensuring the FAIR integration of Citizen Science (CitSci) data with other in-situ data and INSPIRE data in the Green Deal Data Space (GDDS). Key CitSci projects were identified, and the SensorThings API (STA) was adapted to incorporate the elements needed to make CitSci data FAIR while allowing for the recognition of the data contributors. The true integration of these projects in the European GDDS relies on the creation of a Citizen Science node for data relevant to the GDDS. AD4GD mapped the heterogeneous IoT data models and communication protocols into the STA data model. Innovative interoperability approaches are applied in the pilots to deliver IoT datasets as online services, using the agreed vocabularies for variables.
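The mapping of heterogeneous IoT data models into the STA data model can be illustrated with a minimal sketch. The two vendor payload layouts, their field names and the Datastream identifier below are hypothetical and are not taken from the project; only the Observation properties (`phenomenonTime`, `result` and the `Datastream` link by `@iot.id`) follow the SensorThings API data model.

```python
from datetime import datetime, timezone

# Hypothetical vendor payload layouts; the field names are illustrative only.
def from_vendor_a(msg):
    # Vendor A (assumed): epoch seconds and a flat "pm25" field sent as text.
    t = datetime.fromtimestamp(msg["ts"], tz=timezone.utc)
    return t, float(msg["pm25"])

def from_vendor_b(msg):
    # Vendor B (assumed): ISO 8601 time string and a nested reading.
    t = datetime.fromisoformat(msg["time"])
    return t, float(msg["reading"]["value"])

def to_sta_observation(phenomenon_time, result, datastream_id):
    """Build a SensorThings API Observation body linked to an existing Datastream."""
    stamp = phenomenon_time.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "phenomenonTime": stamp,
        "result": result,
        "Datastream": {"@iot.id": datastream_id},
    }

# Two differently shaped messages describing the same measurement.
obs_a = to_sta_observation(
    *from_vendor_a({"ts": 1700000000, "pm25": "12.5"}), datastream_id=1
)
obs_b = to_sta_observation(
    *from_vendor_b({"time": "2023-11-14T22:13:20+00:00", "reading": {"value": 12.5}}),
    datastream_id=1,
)
```

Both messages end up as the same Observation body, which is the point of mapping into a common model: downstream services only need to understand one structure.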
Copernicus has evolved into the Copernicus Data Space Ecosystem and ECMWF is participating in DestinE data lakes. This should allow for integration of EO in the GDDS in a way that is consistent and synergistic with long-term investments of the EC. The capacity to combine EO with numerical modelling data was set up in HPC. An HPC infrastructure to execute data analytics, machine learning and AI algorithms in general was also prepared. It is ready to assess the quality and trustworthiness of data and the consistency among diverse data sources.
The SensorThings API service in the AD4GD data space demonstrates a core-to-edge architecture. On the one hand, a FROST server centralises all observations gathered by the sensors. On the other hand, sensors attached to Raspberry Pi hardware allow data to be processed at the edge: anomalies in the sensed data are detected there and sent directly via the SensorThings API over MQTT.
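The edge side of this architecture can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the rolling z-score detector, its window and threshold, the Datastream identifier 7 and the sample readings are all hypothetical. The topic follows the SensorThings API convention of publishing new Observations to the resource path of a Datastream's Observations collection (the `v1.1` version prefix is assumed). The actual MQTT publish call is left to whichever client library runs on the device.

```python
import json
from collections import deque
from statistics import mean, stdev

class RollingAnomalyDetector:
    """Flag values far from the mean of a window of recent normal readings."""

    def __init__(self, window=20, threshold=3.0):
        self.values = deque(maxlen=window)
        self.threshold = threshold

    def is_anomaly(self, x):
        flagged = False
        if len(self.values) >= 5:  # wait for a minimal baseline first
            mu = mean(self.values)
            sigma = stdev(self.values)
            flagged = sigma > 0 and abs(x - mu) > self.threshold * sigma
        if not flagged:
            self.values.append(x)  # keep anomalies out of the baseline
        return flagged

def sta_mqtt_message(datastream_id, phenomenon_time, result):
    """Return the (topic, payload) pair for creating an Observation over MQTT."""
    topic = f"v1.1/Datastreams({datastream_id})/Observations"
    payload = json.dumps({"phenomenonTime": phenomenon_time, "result": result})
    return topic, payload

detector = RollingAnomalyDetector(window=10, threshold=3.0)
readings = [20.1, 20.3, 19.9, 20.0, 20.2, 20.1, 35.0, 20.0]  # made-up values
outbox = []
for i, value in enumerate(readings):
    if detector.is_anomaly(value):
        # Only anomalous readings are queued for publication from the edge.
        outbox.append(sta_mqtt_message(7, f"2024-01-01T00:00:{i:02d}Z", value))
```

Filtering on the device keeps routine readings off the network in this sketch; a deployment that also centralises every observation in the core server would publish those through a separate path.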
AD4GD and the sister projects have contributed to the GREAT project blueprint for the Green Deal Data Space (intended to make the data space useful and connect it to the European decision-making process) by building the community of actors and stakeholders around the data space (Community of Practice) and by defining rules of participation in terms of standards and interoperability arrangements. AD4GD has also interacted with relevant initiatives in the Data Spaces field, such as the DSBA and the DSSC.
The project has defined the three pilots, their data needs and the way the Green Deal Data Space can be used to develop the pilots. The pilots are:
Biodiversity: Habitat connectivity affects the distribution of species within ecosystems. It is a key biodiversity indicator and, therefore, a cornerstone in European restoration policies. This pilot aims to monitor historical connectivity indices in Catalonia using data coming from remote sensing maps, species occurrence observations and sensor data to enhance regional and local actions. It is focussed on habitat connectivity using satellite data, human observations, IoT technologies and two different algorithms that will be cross-validated.
Air Quality: Poor urban air quality causes serious health issues across Europe. This pilot employs low-cost IoT sensors to make air pollution forecasting more accurate and localised while involving citizen communities. The goal is to provide detailed local air quality insights that support informed health decisions and help to establish replicable models for improving urban air quality. IoT data and in-situ human measurements help to enrich the EO data provided by the Copernicus Atmosphere Monitoring Service.
Zero pollution in water reservoirs: Berlin’s small lakes are ecological hotspots at risk due to various climate and urban stressors. These lakes also suffer from a lack of information on their condition. This pilot project aims to develop indicators of water quality and availability by integrating IoT sensor data with satellite and citizen science data to improve decision-making at a local level.
* Automated ingestion of air quality observations coming from diverse low-cost sensors. This integration of several data sources into a single one will be applied in the air quality pilot.
* A Data Trustworthiness Framework to validate data and express its quality with QualityML and SensorML is being developed and applied to all three pilots.
* A Green Deal Information Model is developed as a common vocabulary to provide the basis of a common Green Deal Data Space, enabling interoperability and integration of different systems, potentially from different vendors.
* A Data catalogue with Semantic Uplift using GeoDCAT.
* The OGC RAINBOW as a Web-accessible source of information about things (“Concepts”) that the OGC defines or that communities ask the OGC to host on their behalf. It applies FAIR principles.
* TAPIS, "Tables from OGC APIs for Sensors" as a client application developed in HTML and JavaScript to support data mobilisation in STAplus.
Scientific results and tools for scientists are:
* A habitat connectivity open data cube as a service, giving access to the data through Jupyter Notebooks.
* Open, graph-oriented workflows for habitat connectivity computation.
* Routines to import observations to STA (FROST implementation) and transformation of observations to a standard service.
* Water modelling and prediction that models the status and evolution of water quality and quantity in Berlin lakes.
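To give an idea of what a graph-oriented habitat connectivity computation can look like, the sketch below treats habitat patches as graph nodes and links two patches when their centroids lie within an assumed dispersal distance. It then reports the share of total habitat area found in the largest connected group. This simple index, the patch names, coordinates, areas and distance thresholds are all illustrative assumptions; they are not the algorithms or data used in the biodiversity pilot.

```python
from itertools import combinations
from math import dist

def connected_components(patches, max_distance):
    """Group patches that are linked, directly or through other patches."""
    parent = {name: name for name in patches}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Add an edge whenever two centroids are within the dispersal distance.
    for a, b in combinations(patches, 2):
        if dist(patches[a]["xy"], patches[b]["xy"]) <= max_distance:
            parent[find(a)] = find(b)

    groups = {}
    for name in patches:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())

def largest_component_share(patches, max_distance):
    """Fraction of total habitat area inside the largest connected group."""
    total = sum(p["area"] for p in patches.values())
    components = connected_components(patches, max_distance)
    largest = max(sum(patches[n]["area"] for n in c) for c in components)
    return largest / total

# Hypothetical patches: centroid coordinates and areas in arbitrary units.
patches = {
    "A": {"xy": (0.0, 0.0), "area": 10.0},
    "B": {"xy": (1.0, 0.0), "area": 5.0},
    "C": {"xy": (2.0, 0.0), "area": 5.0},
    "D": {"xy": (10.0, 0.0), "area": 20.0},
}
share_short = largest_component_share(patches, max_distance=1.5)
share_long = largest_component_share(patches, max_distance=9.0)
```

With the shorter distance, patch D stays isolated and only half of the habitat area is mutually reachable; with the longer distance, all patches form one group. Comparing such values across land-cover maps from different years is one way a historical connectivity trend could be expressed.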