The traditional model of scientific research involves teams working separately, using their own resources and sharing data only via research papers. Weaknesses of this model can include lengthy delays in the availability of information. Additionally, the bulk of data, and also the analysis software, are usually not publicly available at all. Such factors affect repeatability, which partly determines the robustness of scientific conclusions. A single platform for scientific data storage and software tools would solve these problems. Although scientific e-infrastructure options already exist, they are limited. Most are narrowly focused, serving only a single research field or national community. Current solutions are also inadequate to handle the immense quantities of data involved in scientific applications. The EU-funded EOSCpilot project examined the data storage needs of scientific research organisations and proposed a draft solution. The project’s name refers to it being a pilot study for the European Open Science Cloud programme. EOSC is conceptually similar to consumer-level cloud-based sharing, but vastly larger, more complex and more open.
“The ambition for EOSC is enormous,” says Dr Juan Bicarregui, EOSCpilot project coordinator and Head of Data at the Science and Technology Facilities Council. “It will support all of European research, presently 1.7 million researchers, by providing free, open services for data storage, management, analysis and reuse across disciplines.” EOSC will also enable the sharing of software, various other tools, instruments and methodologies. Everything necessary for research will eventually be shared, including training materials. Rather than building a monolithic new system, EOSC will bridge existing data structures. “Users may eventually use the EOSC system without realising they are doing so,” says Dr Bicarregui, “accessing the larger infrastructure through their current domains. We also looked at the architecture of EOSC, considering how interoperability could be achieved, both for data and for technology.”
Case study evaluations
EOSCpilot was not intended to build the EOSC system, but rather to document and evaluate its requirements. In making the evaluations, the project team examined numerous case studies, called science demonstrators, spanning diverse disciplines. A prototype EOSC portal has been running since November 2018. However, this is only a first step. The portal will be developed over many years. The next stage will be for other projects in the INFRAEOSC and Open Science Cloud programmes to build the e-infrastructure EOSCpilot has defined. EOSCpilot researchers additionally considered how EOSC should be governed. The team proposed a framework, which EOSC has adopted until 2020. After that, a future model for governance will be developed. Dr Bicarregui explains that one remaining challenge concerns changes to the culture of science. Currently, researchers gain recognition only through their publications. Researchers need incentives to share data and other outputs. EOSC will provide the technology to support open science, and may be able to foster the cultural changes as well. Apart from achieving its targets, EOSCpilot also provided a common ground that enabled its many partner organisations to work together effectively. The project has moved EOSC from an ambitious vision to something concrete.
Keywords: EOSCpilot, EOSC, data, science, cloud, data storage, European Open Science Cloud, science demonstrators