Connect and align ELIXIR Nodes to deliver sustainable FAIR life-science data management services

Reporting period: 2020-02-01 to 2021-09-30

The ELIXIR-CONVERGE project will ensure the provisioning, across Europe, of distributed local support to life science researchers for data management that is based on a common ELIXIR-wide toolkit that enables lifecycle management for research data according to international standards. The project also aims to spread excellence by targeting data management training and outreach to prospective ELIXIR member countries in the ERA.

The long-term operations of national nodes will be strengthened by the development and introduction of business models, impact assessment and funding strategies for the provisioning of research data management support and services. By driving the alignment of national operations, ELIXIR will catalyse a transformation of national and transnational data management and position Europe as a global leader for federated research data.

ELIXIR-CONVERGE is designed to address the major, complex and outstanding issues of building and harmonising national data management practices, access to human capital, impact assessments and national roadmap positioning. ELIXIR-CONVERGE will harmonise toolkits, operations and monitoring indicators. Specifically, we aim to achieve this sustained national capacity for life science research data management by delivering results against four project objectives.
KR1.1. Established European expert network of data stewards that connect national data centres and similar infrastructures and drive development of interoperable solutions following international best practice, including national interpretations of the General Data Protection Regulation (GDPR)
- A network of life science data management experts with representation from all ELIXIR nodes has been established. (M1.1)
- The network has established a first version of international data management best practice guidelines targeted to the life sciences. These cover the individual steps of the research data life cycle and data management problem areas in several exemplar research domains, and provide a broad foundation for the development of further domain-specific recommendations. (D1.1)

KR2.1. A comprehensive ELIXIR Training and Capacity building programme in Data Management, directed at both data managers and ELIXIR users, and connected to national training programmes in Data Management in the ELIXIR Nodes and prospective ELIXIR Member countries.
- Established the CONVERGE Training Inventory and Roadmap, and a strategy to develop the ELIXIR-CONVERGE Data Management/Data Stewardship (DM/DS) Course Portfolio and to expand capacity with and within the Nodes, including building blocks of the learning path for data managers, data stewards and data researchers.
- Up to September 2021, 13 ELIXIR Nodes developed and delivered 36 DM/DS training events to 1,121 participants.
- Performed a training gap analysis (D2.1) and identified priority topics, for which targeted ELIXIR Training Materials will be generated.
- Linked training materials in TeSS, the ELIXIR Training Portal, to the RDMkit (WP3 & WP2) and also created training material for the RDMkit (WP3).

KR2.2. Development of a collective group of trainers that support scalable deployment of Data Management training across ELIXIR Nodes.
- Work has started on developing materials for CONVERGE DM/DS train-the-trainer courses, which will become part of the ELIXIR-CONVERGE DM/DS Course Portfolio.
- Work has started on establishing a network of data management trainers.
KR2.3. A substantial cohort of data managers, Node coordinators and researchers with specific data management skills, business planning and knowledge of transnational operations across the ELIXIR Nodes.
- Between May 2020 and September 2021, 13 Nodes developed and delivered 36 data management and data stewardship courses (Tasks 2.2 and 2.3).
- The established network of data management experts has over 100 members, with a core of 37 Data Management Coordinators from all ELIXIR Nodes (M1.1).

KR3.1. Assemble a full-stack harmonised common toolkit comprising all aspects of data management: from data capture, annotation, and sharing; to integration with analysis platforms and making the data publicly available according to international standards.
A common toolkit, the RDMkit, was delivered (D3.1), covering all aspects of the data life cycle: from planning and collecting the data, through processing and analysing, to preservation, sharing and reuse. The toolkit has a well-developed and scalable open editorial and contribution process, with more than 110 contributors to date.

KR3.2. Provide exemplar toolkit configurations for prioritised demonstrators to serve as templates for future use.
The RDMkit has community-contributed domain pages for the demonstrators and beyond: plant sciences, marine metagenomics, human data, biomolecular simulation data, intrinsically disordered proteins, microbial biotechnology, epitranscriptome data, proteomics, toxicology data and bioimaging data. Tool Assemblies provide exemplar toolkit configurations, cross-linked to domain pages. Tool Assemblies currently focus on human data (clinical and translational data, COVID-19 data), bioimaging, plant sciences, marine metagenomics and general data management.

KR3.3. Establish national capacity in using as well as updating, extending and sustaining the toolkit across the ERA.
The RDMkit was established (D3.1) and disseminated through hackathons (M3.2, M3.2.1 and M3.2.2), webinars and presentations at national and international meetings, including invited presentations by the NIH Office of Data Strategy and the EC Open Science Unit. Methodologies to extend the toolkit have been established (D3.2). Outreach materials are available online. Tool assemblies are available for Norway, Finland and France; details on national resources are under construction. The RDMkit is recommended as a resource by the ERC.

KR4.1. Development of a Node Impact Assessment Toolkit based on RI-PATHS methodology
- The main components of the toolkit are being developed and/or maintained, and are aligned with the RI-PATHS methodology.
KR4.2. Adoption of Impact assessment in ELIXIR Nodes, supported by Node coordinators network and feedback on applicability from dialogues with national funders
- Work is ongoing to develop Node capacity in impact evaluation through targeted actions (e.g. workshops, bi-monthly “show and tell”), notably through a set of impact challenges led by, and relevant to, Nodes.
KR4.3. Creation of national public-private partnerships and industry outreach where open life-science data and services stimulate local bioeconomy
- ELIXIR Nodes are building their capacities in areas such as mapping of the local ecosystem and effective communication, thereby becoming better equipped to efficiently engage with the private sector at the local and national levels.
The ELIXIR Research Data Management Kit (RDMkit) is an online guide containing good data management practices applicable to research projects from the beginning to the end. Developed and managed by people who work every day with life science data, the RDMkit has guidelines, information, and pointers to help you with problems throughout the data’s life cycle.

