
Photon and Neutron Open Science Cloud

Periodic Reporting for period 1 - PaNOSC (Photon and Neutron Open Science Cloud)

Reporting period: 2018-12-01 to 2020-05-31

The Photon and Neutron Open Science Cloud (PaNOSC) is a joint undertaking of photon and neutron sources on the ESFRI roadmap to make data preservation, sharing and re-use a reality and to link to the European Open Science Cloud (EOSC). Data management according to the FAIR principles will benefit both the scientific community at large by making petabytes of data accessible and re-usable, and the research institutes and their users by providing services to reduce, analyse and publish their data.

PaNOSC is one of five science cluster projects supported by the INFRAEOSC-04 call. The partners joining forces in the PaNOSC project are ESRF (synchrotron), ILL (neutron source), EuXFEL (free electron laser), ELI-DC (multi-site optical laser sources) and ESS (neutron source), together with the ERIC CERIC-ERIC (3 photon and neutron sources) and the e-infrastructures EGI (partner) and GÉANT (contributor). The participating RIs have very different levels of maturity, ranging from operational facilities currently being upgraded (ESRF, ILL), through facilities recently put into operation (CERIC-ERIC, EuXFEL) and one just starting up (ELI), to one still under construction (ESS). PaNOSC will share best practices and expertise to bring all partner sites rapidly up to the same level of FAIR data management.

The five key objectives of the project are as follows:

1. Link the participating Research Infrastructures (RIs) to the EOSC by exposing data, providing services, and promoting its use by the scientific community. Technically this involves persistent identity management, access to compute and storage, data transfer and archival. Organisationally it implies a common approach between research infrastructures and provision of an efficient and competent user service.
2. Make scientific data produced at Europe’s major Photon and Neutron sources fully compatible with the FAIR principles by adopting a harmonised data policy and by adding rich and meaningful metadata to the experimental data generated in the RIs. All publicly funded data generated in the RIs will be made open (after an embargo period) and downloadable in accordance with the data policies adopted at the facilities.
3. Provide innovative data services to the users of these facilities locally and to the scientific community at large through the EOSC. Remote data reduction and analysis services will be implemented to help scientists interact with data sets of variable size.
4. Increase the impact of RIs by ensuring data from user experiments can be used beyond the initial scope. Exposing data to the EOSC will allow data sets from different laboratories to be combined across domains and disciplines.
5. Share the outcomes with the national RIs who are observers in the proposal and the scientific community to promote the adoption of FAIR data principles, data stewardship and uptake of the EOSC. The outcome of the work undertaken in PaNOSC will be shared with and promoted in the entire photon and neutron community and beyond.
The project has progressed as foreseen by its workplan. The results include shared practices for data management, an updated data policy, a search API for open data, metadata harvesting and registration with OpenAIRE and re3data, Jupyter notebook services, simulation software, and active contributions to the HDF5 and Jupyter ecosystems.
The release of a new FAIR data policy for PaN sources has generated significant interest on Zenodo; discussions will now start on its implementation at the PaNOSC partner RIs and its promotion within the community. Metadata harvesting has been enabled by implementing OAI-PMH at all sites and registering the sites with the EOSC projects OpenAIRE and re3data to make the data findable. An electronic logbook developed at the ESRF shows how metadata can be enhanced with a rich description of the experiment.
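As an illustration of the OAI-PMH harvesting mentioned above, the following is a minimal sketch of what a harvester does: it builds a `ListRecords` request and extracts identifiers and titles from the XML response. The verb, the `oai_dc` metadata prefix and the XML namespaces come from the OAI-PMH 2.0 and Dublin Core specifications; the endpoint URL, the function names and the sample record are hypothetical and do not describe any PaNOSC partner's actual service.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for the given endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_records(xml_text):
    """Return (identifier, title) pairs found in a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        records.append((identifier, title))
    return records


# Hypothetical response, hand-written for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:dataset-1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hypothetical open dataset</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # The endpoint below is a placeholder, not a real facility URL.
    print(list_records_url("https://example.org/oai"))
    print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would additionally fetch the URL over HTTP and follow the `resumptionToken` that OAI-PMH uses for paging through large result sets; both are left out here to keep the sketch self-contained.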
The Jupyter notebook services have been set up at all partner sites, including EGI. In the future Jupyter services should be standard services offered by the EOSC with support for different hardware resources, including CPUs and GPUs.
Site data portals are being installed and improved at partner sites and allow access to open data and analysis services. Work has started on an HDF5 web viewer. This viewer will have an impact for the wider scientific community once it is integrated in the Jupyter ecosystem.
The ray-tracing simulation software OASYS has been further improved and extended and is now the de facto solution for designing new beamlines at photon sources world-wide.
One of the main objectives of PaNOSC is to connect the Photon and Neutron RIs to the EOSC. A first step consisted of implementing an AAI (Authentication and Authorisation Infrastructure) with GÉANT in such a way as to be compatible with the future EOSC AAI. The same is underway for data transfer, where Globus-online and OneData are being validated. However, it is not yet clear which technology will be part of the EOSC core services.
The pan-learning.org training platform has been installed at ESS, and a number of training activities have been carried out using it.

The COVID-19 pandemic has forced the PaNOSC partners already in operation to offer remote services for all their facilities, including remote experiment control. This has made the outcomes of PaNOSC and the EOSC especially important and raised their priority at all sites.
Synchrotrons, Free Electron Lasers (FELs), Optical Lasers and Neutron Sources play a crucial role in research on some of the most significant fundamental scientific and societal challenges. The scientific productivity of our facilities will be enhanced through interoperable open data in compliance with the FAIR principles. The current health crisis has highlighted the utmost importance of remote access to facilities and FAIR data management. Rapid public availability of COVID-19 research data is crucial to advancing the understanding of how the coronavirus functions, finding pathways for treating patients and stemming the spread of the virus. The PaNOSC project is timely and has made it possible to rapidly provide an open science data analysis portal based on Jupyter notebooks. A number of the participating facilities are in the process of making data sets relevant to COVID-19 research available based on FAIR principles.
Most of the ESFRI roadmap projects, and similarly all national RIs, are facing an exponential growth in research data. PaNOSC is a stepping stone for harnessing the data avalanche, for ensuring that scientific productivity can be kept up or increased, and for making access to data, software, and computing resources seamless through the brokering of the EOSC ecosystem. However, this is not only a technical challenge. Making it a reality requires an intensive coordination effort with the other science cluster projects (EOSC-LIFE, ENVRI-FAIR, ESCAPE and SSHOC) and the ongoing and future INFRAEOSC projects. In this context PaNOSC will make it possible to acquire experience with remote analysis services and prepare the ground for a pan-European support infrastructure to help scientists at large interact with FAIR data.
Rollup poster
PaNOSC Objectives
PaNOSC presentation at the CNRS EOSC event
Interview with Prof. Fangohr on data analysis services and citizen science for COVID19
PaNOSC+ExPaNDS combined 2nd Annual Meeting announcement
PaNOSC 1st Annual Meeting group photo
A further step towards EOSC: sharing research data on ultramarine blue disease in masterpieces