
EUCANCan: a federated network of aligned and interoperable infrastructures for the homogeneous analysis, management and sharing of genomic oncology data for Personalized Medicine.

Periodic Reporting for period 2 - EUCANCan (EUCANCan: a federated network of aligned and interoperable infrastructures for the homogeneous analysis, management and sharing of genomic oncology data for Personalized Medicine.)

Reporting period: 2020-07-01 to 2021-12-31

In brief, EUCANCan’s ultimate aim is to implement the first operational, federated and interoperable infrastructure for the secure sharing of genomic and clinical cancer data within and across Canada and Europe. This infrastructure will connect several cancer research centers and should allow authorised researchers at these institutions to search, select and securely download cancer genomic datasets of interest from other centers in the same network. The infrastructure is therefore designed as an assembly of different elements, or pillars, each covering a specific and key aspect of data sharing: technical, methodological, and ethico-legal. Whereas most worldwide initiatives promoting data sharing in biomedicine (e.g. GA4GH, ELIXIR) concentrate on developing optimal components of such infrastructures, EUCANCan focuses mainly on connecting the best available components into an operational infrastructure, and on addressing the current challenges of interaction between communities and of interoperability within federated data environments.


The creation of the EUCANCan infrastructure is expected to greatly benefit data-driven research and biomedicine by giving the community the opportunity to extract and use health data more effectively, and ultimately to find global and improved protocols for diagnosis and treatment within Genomic Oncology and Personalized Medicine.
Technical and research WPs (from 2 to 6) have been designed to allow the management, analysis and sharing of genomic oncology data within research data sharing spaces. These cover: (a) high-quality and homogeneous genome analysis strategies that allow direct and easy comparison and aggregation of processed datasets coming from different studies (WP2); (b) a computational framework to support the local management and sharing of genomic and clinical data in standardized formats (WP4 and WP5); (c) an ethico-legal framework that maximizes the possibilities for data sharing, in agreement with European and Canadian data protection policies (WP6); and (d) a strategy to explore and provide for the sustainability and expansion of the EUCANCan infrastructure beyond the project, through its potential integration within sustainable infrastructures such as EGA and open cloud technologies (WP3).
Within the first 36 months of activity, EUCANCan has accomplished important intermediate goals and milestones: (1) the generation of the first version of a benchmarking platform for somatic variant calling from cancer whole genome sequences; (2) the implementation of Overture (https://www.overture.bio/) in three different nodes, allowing interoperable management and indexing of genomic and phenoclinical data, as well as searching of these data through the EUCANCan data portal; (3) the implementation of genome analysis (variant calling) pipelines within OpenCloud environments; and (4) the review and generation of specific legal-oriented guidance on global sample reidentification criteria and on the exchange of health data within Canada and with Europe.
In addition to the specific scientific advances, EUCANCan's dissemination activity has also translated into a substantial impact within the community, fostering important collaborations to further enhance global data sharing. Beyond many interactions with similar European projects (IPC, EUCANShare, CINECA and others), EUCANCan became a driver project within the Global Alliance for Genomics and Health (GA4GH). This mutually beneficial interaction will not only enhance the visibility of EUCANCan within the biomedical community, but will also allow us to align and work together with major data sharing initiatives worldwide. Importantly, during this time, EUCANCan has also served as a model and guide for other initiatives that also consider the sharing of cancer data, such as Beyond 1 Million Genomes (B1MG), where our project is helping to drive the cancer group as well as the variant calling benchmarking efforts. In this context, the project is also interacting with OpenEBench to expand and consolidate our activity in the area of genome analysis by creating useful (and sustainable) benchmarking solutions.
The activity of the first reporting period laid the groundwork for completing the different tasks and achieving our initial goals, by defining and building key technical, methodological and legal parts of the EUCANCan infrastructure. During the second period, we have started to capitalize on previous efforts and to build the final network by connecting some of the partner centers as the first nodes of the targeted data sharing environment. Around this federated infrastructure, we are also building the analysis frameworks that should support efficient sharing of comparable genomic data. In addition, the Canadian and European efforts towards reviewing and defining the data protection frameworks in each participating country will be combined with the technical and methodological aspects of the entire infrastructure to ensure compliance with local, national and international data protection laws. Finally, during this past period, we have organized several training and scientific events, which have significantly increased the visibility of the project and promoted our developments, reaching other public and private initiatives.
Visual summary of the project