Periodic Reporting for period 1 - RECODID (Integrated human data repositories for infectious disease-related international cohorts to foster personalized medicine approaches to infectious disease research)
Reporting period: 2019-01-01 to 2020-06-30
Infectious disease cohorts collect two kinds of data: data from participant interviews, clinical assessments, and related geospatial, social, and environmental exposures (hereafter clinical-epidemiological data, CE data), and high-dimensional data from advanced laboratory analysis of clinical samples (HDL data or OMICS data). These data are typically stored in separate repositories, so a system is needed to facilitate their synthesis and analysis within and across cohorts.
Combining data repositories across cohorts for infectious disease research is rare. However, significant investments have been made in sharing CE and HDL data from population-based registries in high-income countries to improve personalised medicine, especially in the fields of chronic and rare diseases.
The overarching goal of this project is to develop an integrated, sustainable platform for the sharing, synthesis, and analysis of both CE and HDL data from infectious disease cohorts, in keeping with the principles of FAIR (Findable, Accessible, Interoperable, and Reusable) data-driven research. A sustainable platform means that the platform is user-friendly, equitable, searchable, and sharable, with the original data provider retaining control over data uses and permissions. The repository will be based on a federated model in which a tiered permission system and cohort-specific hubs facilitate cohorts’ analyses of their own data and cross-cohort analyses of synthesised data, within a clearly elaborated legal, ethical, and equitable framework for cross-study data sharing.
The first year was characterized by the implementation of the workflows for data harmonisation and the building of the platform pipeline. Some of the regulatory/ethics work in WP1/2 was delayed by the COVID-19 pandemic, starting at around month 14. A survey on the perceived benefits and risks of data sharing was carried out, and the results will be analyzed during the second period. In WP3, activities focused on the harmonization of arbovirus cohort data. A master dictionary of variables was generated, and the partners agreed on an overarching research question to unify the approach. Partners in WP3 also worked on biostatistical methods for pooling heterogeneous data sets, particularly with regard to measurement error and causal inference in pooled cohorts. Several manuscripts were published or are in preparation. One manuscript is available as a preprint (Campbell et al. 2020, see task 3.3 in the scientific report), and a systematic review was accepted by PROSPERO (see task 3.4 in the scientific report). One manuscript was published (Jong et al., Research Synthesis Methods, 2020).
Workflows in WP4 were altered by the COVID-19 pandemic. This caused some delay, but new synergies also emerged between ongoing projects regarding the development of the cohort browser and the data hub infrastructure. As a result, four new public analytical workflows have been fully integrated into the data hubs system, demonstrating the relative simplicity and flexibility with which pipelines can be integrated.
The disruption of workflows caused by the COVID-19 pandemic affected all work packages. In WP5, the work on local governance had to be paused completely for some months, as it is carried out in close collaboration with the partner in Colombia, and Colombia was hit hard by the COVID-19 pandemic, including lockdown regulations. The stakeholder meetings planned within WP6 could not take place and have been postponed. However, we were able to hold our annual meeting, which had been planned for May 2020 in Colombia, as a virtual meeting with input from all partners. The meeting was successful and showed that an effective meeting can be organized even under these exceptional circumstances.
Work in WP7 was successful: we had a functioning website early in the course of the project and continue to have very fruitful and frequent interactions with the other EU-CAN funded consortia. Among the three cross-consortium working groups established (ethical/legal/social; communication; harmonization of data), the working group on 'harmonization of data' was initiated by the PI of ReCoDID and has since attracted considerable attention among the EU-CAN researchers.
ReCoDID was very active within the EU-CAN cross-consortium activities, in which several cross-consortium working groups were created, among them one on 'data harmonization'.
Within the ethical/legal work package (WP2), the group decided to embark on additional work beyond the initial aims of the project and to carry out cognitive interviews about participants' perception of broad informed consent in research.
Furthermore, the project is on track in following a broader vision of integrated shared data repositories and virtual federated repositories for biological samples, a topic that has gained substantially more traction since the projects were evaluated and selected. In WP4, additional synergies between different initiatives promoting virtual federated biorepositories were identified. We are now working with Gates- and UNICEF-funded projects on virtual federated biorepositories, after this topic was discussed at the American Society for Tropical Medicine meeting in Washington in 2019, in a symposium organized by FIND (Foundation for Innovative New Diagnostics, Geneva), WHO, NIH, and representatives of ReCoDID.