In this second reporting period a pilot ESCAPE Data Lake was designed, deployed and successfully assessed through a joint exercise by the partner ESFRI RIs. The exercise covered a full processing chain: (i) from data recording at the telescopes, detectors and sensors to data browsing and access by users; (ii) the data were then challenged through several RI-designed, production-like activities covering processing workflows and data analysis pipelines. Some end-to-end workflows have been run entirely within the Data Lake, which, although originally developed for particle physics, has now been extended and adapted to the astroparticle physics, astronomy and cosmology communities thanks to ESCAPE. A broader community of potential users has been engaged by integrating the Data Lake with higher-level services that hide its complexity and give users a simple view of the service. AAI services and HPC interfaces within the Data Lake were successfully tested in collaboration with the FENIX/HPC project.
ESCAPE beneficiaries contributed to the development, benchmarking and deployment of software. Gathering common practices and know-how in order to define the best community approaches to software FAIRness was a highly relevant step forward for the benefit of the EOSC architecture. Partners developed and contributed to a series of software packages and services, enriching the content of the ESCAPE catalogue, including innovative developments mostly in the domains of machine learning methods and high-performance programming. All developments are openly available, and an example project demonstrates the full capabilities of the system, from best practices in software development (e.g. license and metadata), through continuous integration, to testing and uploading the project to the catalogue, which is linked to the EOSC core services.
The integration of the Virtual Observatory (VO) into the current EOSC infrastructure has progressed significantly during the reporting period. The VO community of data providers and consumers has extensively analysed the connection points, requirements and challenges involved. Major progress was achieved towards interoperability standards and tool definition for multi-messenger investigations, in view of the future advent of the ESFRI projects currently under construction.
The main progress in implementing the analysis platform concerned the “integration” activity, namely enabling the prototype science analysis platform to interface with most of the other components of the ESCAPE cell: (i) the “Data-Lake-as-a-Service” (DLaaS) project; (ii) attached storage in the analysis environment of any user, making large data products available to the compute resource; (iii) VOSpace storage and query services; (iv) analysis software catalogue access and deployment; (v) user analysis environment services.
ESCAPE has been working to integrate the Zooniverse citizen science platform into its analysis platform, with the aim of addressing the requirements of the widest possible range of user categories. A recent extension of our tasks has been to engage not just with the ESCAPE science community, but also with the wider EOSC Future communities.
Finally, we acknowledge the active participation of the ESCAPE coordinator in setting up the EOSC Future work programme and in establishing a dedicated consortium. The coordination action with the other science clusters (EOSC-Life, SSHOC, ENVRI-FAIR and PaNOSC), as well as with other pan-European e-infrastructures, has been intense and successful. We also acknowledge the large number of workshops, co-edited position documents and meetings with the EOSC Association Board, the ESFRI Board and the EC.