EUCANCan: a federated network of aligned and interoperable infrastructures for the homogeneous analysis, management and sharing of genomic oncology data for Personalized Medicine.

Periodic Reporting for period 1 - EUCANCan (EUCANCan: a federated network of aligned and interoperable infrastructures for the homogeneous analysis, management and sharing of genomic oncology data for Personalized Medicine.)

Reporting period: 2019-01-01 to 2020-06-30

In brief, EUCANCan’s ultimate aim is to implement the first operational, federated and interoperable infrastructure for the secure sharing of genomic and clinical cancer data within and across Canada and Europe. This infrastructure will connect several cancer research centers and should allow authorised researchers at these institutions to search, select and securely download cancer genomic datasets of interest from other centers in the same network. The infrastructure is therefore designed as an assembly of different elements or pillars, each covering a specific key aspect of data sharing: technical, methodological, and ethico-legal. Whereas most of the worldwide initiatives promoting data sharing in biomedicine (e.g. GA4GH, ELIXIR) concentrate on developing optimal components for these infrastructures, EUCANCan focuses mainly on connecting the best available components and building an operational infrastructure, addressing current challenges related to community-to-community interactions and interoperability within federated data environments.

The EUCANCan infrastructure is expected to greatly benefit data-driven research and biomedicine by giving the community the opportunity to extract and use health data more effectively, and ultimately to find global and improved protocols for diagnosis and treatment within Genomic Oncology and Personalized Medicine.
The technical and research work packages (WPs 2 to 6) cover all the aspects needed to share and effectively compare different datasets within genomic oncology: (a) high-quality and homogeneous genome analysis protocols that allow direct and easy comparison and aggregation of processed datasets coming from different studies (WP2); (b) a computational framework to support the local management and the sharing of genomic and clinical data in standardized formats (WP4 and WP5); (c) an ethical and legal framework that maximizes the possibilities for data sharing, in agreement with European and Canadian data protection policies (WP6); and (d) a strategy to explore and provide sustainability and expansion of the EUCANCan infrastructure beyond the project through its potential integration within sustainable infrastructures, such as EGA and open cloud technologies (WP3). During these first 18 months, EUCANCan’s major progress across these areas is highlighted by (1) the assessment of genome analysis variability across sites and the generation of variant calling and corresponding benchmarking resources; (2) the implementation of the first components of the sharing infrastructure across four different sites; (3) the implementation of genome analysis (variant calling) pipelines within OpenCloud environments; and (4) the review and generation of specific legal-oriented guidance for global sample reidentification criteria and for the exchange of health data within Canada and with Europe.

In addition to these specific scientific advances, EUCANCan's dissemination activity has had a strong impact within the community, driving important collaborations that further enhance global data sharing. Beyond many interactions with similar European projects (IPC, EUCANShare, CINECA and others), EUCANCan became a driver project within the Global Alliance for Genomics and Health (GA4GH). This mutually beneficial interaction will not only enhance the visibility of EUCANCan within the biomedical community, but also allow us to align and work together with major data sharing initiatives worldwide. Another example is the participation of EUCANCan in the 1 Million project, both as a project and through the direct involvement of some of our partners in different work streams of this European initiative. EUCANCan has also teamed up with the European OpenEBench efforts to expand and consolidate our activity in the area of genome analysis by creating useful and sustainable benchmarking solutions.
The activity of the past 18 months within EUCANCan has laid the groundwork for completing our tasks and achieving our initial goals. Whereas this first period concentrated on defining and building key technical, methodological and legal parts of the infrastructure, current and upcoming activity within the project will focus on connecting these parts and constructing an operational federated system for data sharing in oncology and health. The software developed for the technical aspects of the network (data management, search and transfer) now allows us to start connecting data sites and building an effective data sharing environment. In parallel, the advances made in genome analysis optimization and harmonization will culminate during the second part of the project in a robust benchmarking platform for the optimization and assessment of variant identification software used by clinical and research centres. In addition, the Canadian and European efforts towards reviewing and defining the data protection frameworks in each participating country will be combined with the technical and methodological aspects of the entire infrastructure to ensure compliance with local, national and international data protection laws. Finally, our dissemination efforts will culminate in several training and scientific events during the second part of the project, which are expected to increase the project's visibility, promote our developments and engage other public and private initiatives.