
Bioinformatic approaches to identify and detect both disease- and drug-related genomic alterations in breast cancer patients

Periodic Reporting for period 1 - BRIDGES (Bioinformatic approaches to identify and detect both disease- and drug-related genomic alterations in breast cancer patients)

Reporting period: 2015-06-01 to 2017-05-31

With more than 400,000 new cases in 2012, breast cancer is the most common cancer among European women. Present clinical management causes overtreatment in more than 50% of patients, with implications for both patients’ quality of life and the sustainability of healthcare costs. At the same time, intrinsic or acquired tumour resistance to treatment leads to disease progression towards incurable metastatic disease in a significant proportion of patients.
Advances in cancer genomics highlighted a high inter- and intra-tumour genetic heterogeneity of breast cancer, reinforcing the need for a personalized treatment and a way to non-invasively monitor an evolving disease. Although examples of targeted therapies have been developed in breast cancer (e.g. hormone therapy in estrogen receptor positive tumours or HER2 targeting in HER2 amplified tumours), a still unmet challenge is the implementation of a real personalized treatment and parallel development of companion biomarkers for patients’ stratification and early detection of resistance.
Accordingly, the aims of this project were: 1) to develop new bioinformatics approaches to analyse and exploit large sets of genomic and transcriptomic data from clinical specimens, liquid biopsies and pre-clinical models; 2) to identify candidate predictive biomarkers associated with response to treatment and enable their non-invasive assessment in a liquid biopsy.
Within the time frame of the action, three computational approaches were developed: 1) to identify somatic mutations in cancer with higher sensitivity and specificity; 2) to distinguish human and mouse reads in sequencing data from patient-derived tumour xenografts (PDTXs); and 3) to analyse amplicon-based sequencing data from FFPE and plasma samples. These approaches have been either published or submitted for publication in peer-reviewed international journals.
At the same time, the integrative analysis of genomic and transcriptomic data from a clinical cohort and a PDTX cohort has identified several molecular signatures associated with drug response in breast cancer. Results obtained from the integrative analysis have been presented at the AACR Annual Meeting 2017 and will be submitted for publication in the near future.
Molecular and drug response data from the PDTX cohort have been made available through a user-friendly graphical interface at
Development of new computational approaches
1) We generated a whole exome sequencing benchmark dataset using the Platinum Genome sample NA12878 and developed an Intersect-Then-Combine (ITC) approach to increase the accuracy in calling Single Nucleotide Variants (SNVs) and Indels in tumour-normal pairs. We evaluated the effect of alignment, base quality recalibration, mutation caller and filtering on sensitivity and false positive rate. The ITC approach increased the sensitivity up to 17.1%, without increasing the False Positive Rate per Megabase (FPR/Mb) and its validity was confirmed in a set of clinical samples. The ITC approach has been published (Callari et al. Genome Medicine 2017; PMID:28420412).
2) The molecular characterization of PDTXs using High-Throughput Sequencing (HTS) poses extra challenges caused by the presence of mouse stroma in the sample. Indeed, the high homology between the human and mouse genomes results in a proportion of mouse reads being mapped as human. We developed an approach able to discriminate between human and mouse reads with up to 99.9% accuracy and to decrease the number of false positive somatic mutations caused by misalignment by >99.9%. In RNA-seq and RRBS data analysis, our approach allows the transcriptome and methylome of human tumour cells and mouse stroma to be dissected computationally. A manuscript describing the method is currently under review.
3) A targeted sequencing approach for the mutational profiling of ctDNA was developed, characterised by: i) the use of ultra-low input DNA; ii) a high level of multiplexing that enabled the coverage of a large target region; iii) a bespoke computational pipeline for data analysis and iv) competitive costs. A gene panel for serial monitoring of metastatic breast cancer patients was developed. A set of control experiments was generated to develop the protocol and to measure the performance of the method and of the associated computational pipeline. A manuscript describing the method has been submitted for publication.
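The Intersect-Then-Combine idea in approach 1) can be illustrated with a minimal sketch. It assumes, reading the name literally, that the calls of each mutation caller are first intersected across alternative alignment/processing pipelines and that the per-caller consensus sets are then combined by union; the report itself does not spell out these steps. The caller and pipeline names, the variants and the 50 Mb target size are all hypothetical.

```python
from typing import Dict, Set, Tuple

# A variant is keyed by (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def intersect_then_combine(calls: Dict[str, Dict[str, Set[Variant]]]) -> Set[Variant]:
    """calls[caller][pipeline] is the set of variants reported by that combination.

    Assumed reading of ITC: intersect within a caller, then take the union across callers.
    """
    combined: Set[Variant] = set()
    for per_pipeline in calls.values():
        if not per_pipeline:
            continue
        # Intersect: keep only variants this caller reports in every pipeline.
        consensus = set.intersection(*per_pipeline.values())
        # Combine: add this caller's consensus to the overall call set.
        combined |= consensus
    return combined


def sensitivity(called: Set[Variant], truth: Set[Variant]) -> float:
    """Fraction of true variants that were called."""
    return len(called & truth) / len(truth)


def fpr_per_mb(called: Set[Variant], truth: Set[Variant], target_bp: int) -> float:
    """False positive calls per megabase of target region."""
    return len(called - truth) / (target_bp / 1_000_000)


# Hypothetical toy data: three true variants and two pipeline-specific artefacts.
v1, v2, v3 = ("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"), ("chr2", 300, "T", "G")
fp1, fp2 = ("chr3", 400, "C", "A"), ("chr4", 500, "G", "T")
truth = {v1, v2, v3}

calls = {
    "caller_A": {"pipeline_1": {v1, v2, fp1}, "pipeline_2": {v1, v2}},
    "caller_B": {"pipeline_1": {v2, v3, fp2}, "pipeline_2": {v2, v3}},
}

itc_calls = intersect_then_combine(calls)
naive_union = set().union(*(s for p in calls.values() for s in p.values()))

TARGET_BP = 50_000_000  # hypothetical exome target size
print(sensitivity(itc_calls, truth), fpr_per_mb(itc_calls, truth, TARGET_BP))
print(sensitivity(naive_union, truth), fpr_per_mb(naive_union, truth, TARGET_BP))
```

In this toy case the intersection step drops the two pipeline-specific artefacts, while the union across callers recovers variants that a single caller misses. A plain union of all call sets reaches the same sensitivity but keeps both false positives.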

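The read discrimination problem described in approach 2) can be sketched as a per-read decision. The report does not describe how the published method works, so the rule below is only an assumption: each read is taken to have an alignment score against a human and a mouse reference, and it is assigned to the species with the better score. The function name, the score convention (higher is better, `None` for unmapped) and the `min_margin` parameter are hypothetical.

```python
from collections import Counter
from typing import Dict, Optional, Tuple


def classify_read(human_score: Optional[int],
                  mouse_score: Optional[int],
                  min_margin: int = 1) -> str:
    """Assign a read to 'human', 'mouse', 'ambiguous' or 'unmapped'.

    Scores are alignment scores where higher is better; None means the read
    did not align to that reference. min_margin is the minimum score
    difference required to call a species (an illustrative parameter).
    """
    if human_score is None and mouse_score is None:
        return "unmapped"
    if mouse_score is None:
        return "human"
    if human_score is None:
        return "mouse"
    if human_score - mouse_score >= min_margin:
        return "human"
    if mouse_score - human_score >= min_margin:
        return "mouse"
    return "ambiguous"


# Hypothetical scores: read name -> (score vs human, score vs mouse).
reads: Dict[str, Tuple[Optional[int], Optional[int]]] = {
    "read_1": (150, 120),    # clearly better on human
    "read_2": (118, 150),    # clearly better on mouse (stroma)
    "read_3": (150, 150),    # conserved region: cannot be resolved
    "read_4": (None, 140),   # aligns to mouse only
    "read_5": (None, None),  # aligns to neither reference
}

labels = {name: classify_read(h, m) for name, (h, m) in reads.items()}
summary = Counter(labels.values())
print(labels)
print(summary)
```

Only reads labelled as human would then be passed on to mutation calling, which is how a filter of this kind could reduce false positives caused by mouse reads being mapped as human.
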
Integrative analysis to reveal pharmacogenomics associations in breast cancer
An analysis framework was developed starting from a large breast cancer clinical cohort where, taking advantage of the large sample size, a set of breast cancer-specific molecular signatures was derived using two distinct approaches. The sets of molecular signatures identified using the two approaches were then tested for association with drug response in a patient-derived tumour xenograft (PDTX) cohort, where both molecular data and drug response data were available. A number of significant correlations were found. Validation in independent models and experiments to understand the underlying biological mechanisms are ongoing.
In a cohort of plasma samples from metastatic breast cancer patients, copy number and mutational profiles are being obtained using shallow Whole Genome Sequencing (sWGS), Whole Exome Sequencing (WES) or targeted sequencing, demonstrating the feasibility of non-invasively tracking predictive genomic alterations in breast cancer.
The methodological work performed during this action has led to the development of three computational approaches that are already published or submitted for publication. Importantly, all benchmark datasets generated were made publicly available through precisionFDA or the European Genome-phenome Archive, a valuable resource for other researchers who want to develop and test alternative computational approaches.
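The association step described above, testing a signature against drug response across PDTX models, can be sketched as a rank correlation with a permutation p-value. The report does not state which statistic was used, so Spearman correlation is an assumption here, and the signature scores and response values are invented purely for illustration.

```python
import random
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1, with tied values receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for constant input; report 0
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_pvalue(x: Sequence[float], y: Sequence[float],
                       n_perm: int = 2000, seed: int = 0) -> float:
    """Two-sided p-value obtained by shuffling y and recomputing |rho|."""
    rng = random.Random(seed)
    observed = abs(spearman(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Invented example: one signature score and one drug response value per PDTX model.
signature_score = [0.1, 0.4, 0.5, 0.9, 1.3, 1.8, 2.2, 2.9]
drug_response = [0.05, 0.10, 0.22, 0.30, 0.41, 0.55, 0.60, 0.80]

rho = spearman(signature_score, drug_response)
p_value = permutation_pvalue(signature_score, drug_response)
print(rho, p_value)
```

When many signatures and drugs are tested in this way, the resulting p-values would also need a multiple-testing correction before any correlation is called significant.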
The targeted sequencing approach for the detection of ctDNA was designed to be flexible in terms of choice of targets and regions of interest. Thus, it can be tailored to various cancer types, stages of disease and specific clinical contexts. For these reasons and thanks to its competitive costs, it is expected to be extensively used in translational and clinical research.
The candidate biomarkers identified in this action will be further validated and could reach clinical implementation. Importantly, our Cancer Centre has a Clinical Molecular Diagnostic Laboratory where genetic testing can be performed to prove their clinical utility. Appropriate predictive biomarkers are crucial to make the implementation of precision oncology economically sustainable.
Framework to identify pharmacogenomics associations in breast cancer