Periodic Reporting for period 2 - EUXDAT (European e-Infrastructure for Extreme Data Analytics in Sustainable Development)

Reporting period: 2019-05-01 to 2020-10-31

Summary of the context and overall objectives of the project

A growing number of data sources generate large volumes of data every day, which could be exploited in many areas to improve how we work and to understand our environment better. At the same time, sustainability is a key concern for humanity in general and for Europe in particular: the population is growing, and the environment needs to be protected from over-exploitation. Agriculture is a sector which has an important impact on our environment (especially in rural areas) and on food security for our growing society. We therefore have the opportunity to apply novel analytics techniques to these large amounts of data. Such analytics will help us to take care of the environment (and the soil) while optimizing how we use it, keeping food demand and supply balanced.

As a result, EUXDAT proposes solutions for crop monitoring, for improving land use maps, for making better decisions on which crops to grow in certain types of soil, and for improving farm management in general.

EUXDAT is therefore an e-Infrastructure for Large Data Analytics-as-a-Service. It aims at bringing heterogeneous data sources (Copernicus, climate data, sensor data, UAV data, machinery data, land use data, hydrology data, etc.) together with advanced data analysis tools that can use both Cloud and HPC resources, in order to process huge amounts of data in support of agriculture.
Work performed from the beginning of the project to the end of the period covered by the report and main results achieved so far

The second period of the project focused on developing the requested features and on implementing and validating the defined scenarios, using v1 of the e-Infrastructure as the base for the new work.

The implementation of the EUXDAT e-Infrastructure followed the same approach as in the first period. First, in the context of WP2, the requirements and the high-level architecture were updated. Experience with v1 of the platform and with the first version of the scenarios, together with external feedback (e.g. from the hackathons and other interactions with stakeholders), provided the main inputs for updating the features and the architecture. In addition, two new scenarios were identified for v3 of the e-Infrastructure, although only one of them was implemented in the end.

Based on these definitions and updates, v2 and v3 of the EUXDAT e-Infrastructure progressed in the implementation of the main features and the scenarios.

From the perspective of the end users' platform (WP3), the implementation included more data connectors (e.g. an FTP repository for hyperspectral images, the PESSL API, etc.). More libraries were also made available in the prototyping environment (Jupyter Notebooks), in line with the needs identified by the scenarios and their potential utility. The frontend was substantially improved to give easy access to all features (including accounting) and to provide a rich GUI for the scenarios. The marketplace solution was changed and adapted accordingly, and both the marketplace and the data catalogue were populated.

The background infrastructure was also further developed (WP4). The tools for enabling easy data movement were set up and integrated into the orchestration mechanism. The applications were defined so that they could make optimal use of HPC and Cloud resources.
The SLA management was set up, and the Quality of Service parameters and the SLA templates were defined. Finally, an accounting component was developed and linked with the orchestrator, the monitoring and the SLA manager.

As for the scenarios, scenarios 1, 2 and 3 were further developed (e.g. HPC usage with new features in scenarios 1 and 3, and use of the hyperspectral images in scenario 2). Three other scenarios were developed in v2 and v3: 'Crop climate risk analysis', 'Information support for field use recommendations' and 'Crop weather risk monitoring and prediction'. All of them have been integrated with the rest of the components. The test cases for all components were defined and executed in order to carry out the TRL qualification.

The dissemination plan was followed, with participation in social networks, the organization of hackathons, the publication of scientific papers, the periodic publication of blog posts, etc.

Finally, the exploitation-related tasks continued with the market analysis, and the selected business models were defined in detail (including some financial information), together with the services and the targeted stakeholders.

By the end of the project, we consider that the main results of EUXDAT are:
• A platform for end users with a prototyping environment and all the documentation and tools they need to build their own applications for the agriculture domain;
• An orchestration mechanism that not only allows applications to be executed in a hybrid environment (mixing HPC and Cloud resources), but also provides integrated support for easy data movement;
• A set of ready-to-use scenarios with innovative algorithms that support optimal management of crops in the agriculture domain, in some cases making use of HPC resources.
Progress beyond the state of the art and expected potential impact (including the socio-economic impact and the wider societal implications of the project so far)

EUXDAT offers a thematic cloud platform in the domain of sustainable development with unique features.

EUXDAT is based on a performant e-Infrastructure that enables the processing of large amounts of data. The infrastructure involves both HPC and Cloud resources, with an orchestration mechanism that allows hybrid execution according to the processing characteristics.

The EUXDAT platform already offers a set of data access connectors to a large number of datasets:
- Copernicus data (through the Mundi web services platform)
- Open land use map
- DEM
- Hydrology data and, more generally, OpenStreetMap datasets
- LPIS (Land Parcel Identification System)
- Climate data (Copernicus CDS)
- Access to soil map datasets
- Private data upload
- Meteorological data (through Meteoblue)
- Field sensor data (via Pessl instrument API)
- UAV data (from CERTH)

The EUXDAT platform simplifies the building of a new application in the domain of sustainable development from a set of existing unitary functionalities, such as the data access connectors and the already implemented scenarios, offering:
- an integrated prototyping environment for the initial steps of application development (Jupyter notebook)
- a scalable and evolvable e-Infrastructure enabling the deployment of processing services on both HPC and Cloud
- a large number of datasets and the capacity to add new specific data access connectors
- the capacity to deploy a specific front-end for an application
- support for the transformation of a finalized prototype into a deployed application through an integration process and automated deployment pipelines
- use of the results in the EU Stargate project

The EUXDAT platform can position itself on top of generic EO and data valorisation platforms (such as the DIAS platforms), offering specific functionalities for developers of thematic applications. Besides the business exploitation path, which aims to reach sustainability by selling the EUXDAT services to customers (i.e. application developers), EUXDAT can be a very good basis for future projects in the sustainable development and agriculture domain, such as projects built in the Green Deal program. Reusing the EUXDAT platform assets would minimize both the cost and the time needed to set up a technical basis on which new use cases could be developed.