Integrated human data repositories for infectious disease-related international cohorts to foster personalized medicine approaches to infectious disease research

Reporting period: 2019-01-01 to 2020-06-30

ReCoDID is a multidisciplinary team of researchers from the global infectious disease arena, who will leverage existing infrastructures and partnerships to develop a sustainable model for the storage, curation and analysis of the complex data sets collected from infectious disease-related cohorts. The group is particularly experienced in clinical, epidemiological and laboratory research related to emerging infectious diseases of international concern.
Infectious disease-related cohorts collect two kinds of data: data from participant interviews, clinical assessments, and related geospatial, social, and environmental exposures (hereafter clinical-epidemiological data, CE data), and high-dimensional data from advanced laboratory analysis of clinical samples (HDL data or OMICS data). These data are typically stored in separate repositories, so a system is needed to facilitate their synthesis and analysis within and across cohorts.
Combining data repositories across cohorts is rare in infectious disease research. However, significant investments have been made in sharing CE and HDL data from population-based registries in high-income countries to improve personalised medicine, especially in the fields of chronic and rare diseases.
The overarching goal of this project is to develop an integrated, sustainable platform for the sharing, synthesis, and analysis of both CE and HDL data from infectious disease cohorts in keeping with the principles of FAIR (Findable, Accessible, Interoperable, and Reusable) data-driven research. A sustainable platform means that the platform is user-friendly, equitable, searchable and sharable, with the original data provider retaining control over data uses and permissions. The repository will be based on a federated model in which a tiered permission system and cohort-specific hubs facilitate cohorts’ analyses of their own data and cross-cohort analyses of synthesised data within a clearly elaborated legal, ethical, and equitable framework for cross-study data sharing.
Work in ReCoDID progressed very well during the first year of the reporting period and slowed in the second year due to the COVID-19 pandemic. We were still able to hold our first annual meeting in May 2020, virtually instead of in person in Colombia. Success stories during the first reporting period include a very lively exchange between the EU-CAN funded ‘sister’ projects, with the establishment of cross-consortial working groups (the WG ‘Harmonization’ is led by the PI of ReCoDID) and cross-consortial newsletters. ReCoDID was one of the few projects selected for top-up funding by the European Commission because of its unique focus on data sharing and harmonization of infectious disease-related cohorts. In the new work package ‘COVID-19 Research Response’, ReCoDID will focus on the harmonization of COVID-19 cohorts from Europe and elsewhere, applying the same methodology that was previously established for the arbovirus cohorts. ReCoDID is very well linked with other projects working in the same area (for example the Infectious Disease Data Observatory at Oxford University), capitalizing on synergies and avoiding duplication of work.
The first year was characterized by the implementation of the workflows for data harmonisation and platform pipeline building. Some of the regulatory/ethics work in WP1/2 was delayed by the COVID-19 pandemic, starting at around month 14. A survey on the perceived benefits and risks of data sharing was carried out, and the results will be analyzed during the second period. In WP3, activities focused on the harmonization of arbovirus cohort data. A master dictionary of variables was generated and the partners agreed on an overarching research question to unify the approach. Partners in WP3 also worked on biostatistical methods for pooling heterogeneous data sets, particularly with regard to measurement error and causal inference in pooled cohorts. Several manuscripts were published or are in preparation. One manuscript is available as a preprint (Campbell et al. 2020, see task 3.3 in the scientific report), and a systematic review was accepted by PROSPERO (see task 3.4 in the scientific report). One manuscript was published (Jong et al., Research Synthesis Methods, 2020).
Workflows in WP4 were altered by the COVID-19 pandemic, which caused some delay but also led to new synergies between ongoing projects in the development of the cohort browser and the data hub infrastructure. This has resulted in four new public analytical workflows that have been fully integrated into the data hubs system, demonstrating the relative simplicity, flexibility and ease of integrating pipelines.
The disruption of workflows caused by the COVID-19 pandemic reaches across all work packages. In WP5, the work on local governance had to be paused completely for some months, as it is carried out in close collaboration with the partner in Colombia, and Colombia was hit hard by the COVID-19 pandemic, including lockdown regulations. The stakeholder meetings planned within WP6 could not take place and have been postponed. However, we were able to hold our annual meeting, which was planned for May 2020 in Colombia, as a virtual meeting with input from all partners. The meeting was successful and showed that we could organize an effective meeting in these exceptional circumstances.
Work in WP7 was successful: we had a functioning website early in the course of the project and continue to have very fruitful and frequent interactions with the other EU-CAN funded consortia. Among the three cross-consortial working groups established (ethical/legal/social; communication; harmonization of data), the working group on ‘harmonization of data’ was initiated by the PI of ReCoDID and has since attracted considerable attention among the EU-CAN researchers.
The work in ReCoDID was slightly delayed by the COVID-19 pandemic, but at the same time the project aims have proved more relevant than ever. This is reflected in the fact that the project was selected for additional funding, adding a whole work package on ‘COVID-19 Research Response’ (WP8).
ReCoDID was very active within the EU-CAN cross-consortial activities, in which several cross-consortial working groups were created, among them one on ‘data harmonization’.
Within the ethical/legal work package (WP2), the group decided to embark on additional work beyond the initial aims of the project and to carry out cognitive interviews about participants’ perception of broad informed consent in research.
Furthermore, the project is on track in following a broader vision of integrated shared data repositories and virtual federated repositories for biological samples, a topic that has gained substantially more traction since the projects were evaluated and selected. In WP4, additional synergies between different initiatives promoting virtual federated biorepositories were identified. We are now working with Gates- and UNICEF-funded projects on virtual federated biorepositories, after this topic was discussed at the American Society of Tropical Medicine and Hygiene meeting in Washington in 2019, in a symposium organized by FIND (Foundation for Innovative New Diagnostics, Geneva), WHO, NIH, and representatives of ReCoDID.