
Allele-specific Deconvolution of Tumour DNA Methylation and Expression Data to Reveal Underlying Cell Populations

Periodic Reporting for period 1 - DECODE (Allele-specific Deconvolution of Tumour DNA Methylation and Expression Data to Reveal Underlying Cell Populations)

Reporting period: 2016-08-01 to 2018-07-31

Owing to recent advances in DNA sequencing – the technology allowing us to read the genetic code – we are beginning to understand the mechanisms underlying cancer development and evolution. Tumours, however, are heterogeneous, and sequencing bulk samples reveals a profile that is “averaged” across admixed normal cells and different tumour cell subpopulations, hampering interpretation of the data. While single-cell sequencing can remedy these problems, it is experimentally involved, and bulk sequencing will likely remain the standard for the foreseeable future.
Therefore, in this project, I have developed and applied computational methods that disentangle bulk tumour sequencing data to reveal the distinct profiles of the underlying normal and tumour cells. Validation has come from teasing apart computationally mixed pure samples as well as from in-house single-cell sequencing projects. The disentangled profiles provide an enhanced picture of the molecular changes present in cancer cells. Application of our deconvolution methods therefore allows us to make the most of the wealth of tumour sequencing data flowing from large international consortia. These results will inform the development of personalised treatment for cancer patients.
Development of bulk tumour deconvolution methods:

I have implemented an algorithm to split bulk tumour gene expression into the separate contributions from the underlying normal and cancer cells. This innovation has sparked an ongoing collaboration with the lab of Dr Mansour (UCL Cancer Institute, London, UK) to identify novel cancer genes in acute leukaemia and with the lab of Dr Murchison (Department of Veterinary Medicine, University of Cambridge, UK) to study tumour-host interactions and immune escape in transmissible cancers of dogs and Tasmanian devils.
Together with my PhD student Elizabeth Larose-Cadieux, I have also developed a method to deconvolute bulk tumour data on DNA methylation, an important factor in gene regulation. Our method is the first to accurately detect specific genetic variants in this type of data and quantify their allele frequencies, from which it derives the amount of contaminating normal cells and the number of chromosome copies in the cancer cells. In the final step, these confounders are corrected for, yielding the pure tumour methylation profile.
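
The final correction step can be illustrated with a minimal sketch. It is not the project's implementation. It assumes that the bulk methylation rate at a site is a DNA-content-weighted mixture of the tumour and normal rates, and that the normal rate is known (for example from a matched normal sample, which is an assumption made here). The function name and parameters are hypothetical.

```python
def pure_tumour_methylation(bulk_rate, normal_rate, purity,
                            tumour_copies, normal_copies=2):
    """Estimate the tumour-only methylation rate at one site.

    Assumed mixture model (illustrative, not taken from the report):
        bulk = (p * n_t * m_t + (1 - p) * n_n * m_n)
               / (p * n_t + (1 - p) * n_n)
    where p is the tumour purity, n_t and n_n are the copy numbers in
    tumour and normal cells, and m_t and m_n their methylation rates.
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if tumour_copies <= 0:
        raise ValueError("no tumour DNA at this site; rate is undefined")

    tumour_weight = purity * tumour_copies
    normal_weight = (1 - purity) * normal_copies

    # Solve the mixture equation for the tumour methylation rate m_t.
    tumour_rate = (bulk_rate * (tumour_weight + normal_weight)
                   - normal_rate * normal_weight) / tumour_weight

    # Noise in the bulk measurement can push the estimate out of range.
    return min(1.0, max(0.0, tumour_rate))


# A 50% pure sample with two tumour copies: a bulk rate of 0.5 against a
# normal rate of 0.1 implies a tumour rate of about 0.9.
estimate = pure_tumour_methylation(0.5, 0.1, purity=0.5, tumour_copies=2)
```

Under this model, a site that looks half-methylated in bulk can be almost fully methylated in the cancer cells, which is why purity and copy number have to be estimated first.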

Analysis of single-cell sequencing data:

Single-cell micro-metastases of tumours often occur in the bone marrow. These disseminated tumour cells (DTCs) can lie undetected and dormant for many years, often resisting therapy, before re-activating and giving rise to a new tumour. The nature of DTCs remains elusive, as does when and from where in the tumour they originate.
In collaboration with scientists in Norway, Belgium, the US and the UK, we sequenced the genomes of 63 single cells isolated from bone marrow donated by six patients diagnosed with localised breast cancer. By comparing the genetic changes in these cells with those identified in the deconvoluted primary tumours, we could reconstruct the order of mutations during tumour evolution. In turn, this allowed us to trace the origins of the DTCs to subpopulations of cancer cells in the primary tumour.

Application to large pan-cancer cohorts – Homozygous deletions and tumour suppressors

Normal human cells have two copies of all genes. Some of these genes are so-called tumour suppressors that can stop cancer developing. Only when both copies of a tumour suppressor are gone – known as a homozygous deletion – do cells progress to cancer. However, distinguishing whether a tumour has lost one or both copies is difficult when samples contain an unknown fraction of normal cells.
We therefore deconvoluted the genomic data of more than 2,200 tumour samples, determining the fraction of normal cells and the number of copies of each gene in the cancer cells. The analysis revealed 96 regions of the genome that are frequently lost during tumour development (see Figure). While some of these are prone to DNA breakage without contributing to tumour development, others contained tumour suppressor genes. Developing a statistical model that uses the footprint of the DNA losses to distinguish between the two, we confirmed 16 established tumour suppressors and proposed 27 previously unknown ones.
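
To see why the normal cell fraction matters here, consider a minimal sketch of how a tumour-only copy number could be recovered from a bulk read-depth ratio. It assumes the widely used linear mixture model in which each normal cell contributes two copies. It is not the statistical model developed in this project, and the function names, parameters and the 0.5 cut-off are hypothetical.

```python
def tumour_copy_number(depth_ratio, purity, tumour_ploidy, normal_copies=2):
    """Infer the copy number of a region in the cancer cells.

    depth_ratio   -- bulk read depth in the region divided by the
                     genome-wide average depth of the same sample
    purity        -- fraction of cells in the sample that are cancer cells
    tumour_ploidy -- average copy number across the cancer genome

    Assumed model (illustrative):
        depth_ratio = (p * n_t + (1 - p) * 2) / (p * ploidy + (1 - p) * 2)
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    normal_part = (1 - purity) * normal_copies
    average_copies = purity * tumour_ploidy + normal_part
    # Solve the model for n_t, the copy number in the cancer cells.
    return (depth_ratio * average_copies - normal_part) / purity


def is_homozygous_deletion(depth_ratio, purity, tumour_ploidy):
    """Call a homozygous deletion when the estimate is closest to zero."""
    return tumour_copy_number(depth_ratio, purity, tumour_ploidy) < 0.5


# In a 60% pure diploid tumour, a region retains 40% of the average depth
# even when both copies are lost, and 70% when one copy is lost.
both_lost = is_homozygous_deletion(0.4, purity=0.6, tumour_ploidy=2.0)
one_lost = is_homozygous_deletion(0.7, purity=0.6, tumour_ploidy=2.0)
```

Without a purity estimate, the residual 40% signal from the admixed normal cells could easily be mistaken for a retained copy.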

Application to large pan-cancer cohorts – Clustered mutational processes

Some processes can generate multiple, typically clustered mutations in a single catastrophic event, leading to substantial reconfiguration of the cancer genome. Three such processes have been described: (i) chromothripsis, in which one or a few chromosomes are shattered and the resulting fragments are stitched together at random; (ii) chromoplexy, in which repair of co-occurring DNA breaks, typically on different chromosomes, results in shuffled chains of rearrangements; and (iii) kataegis, a hypermutation process leading to clustered single base changes on a single DNA strand. We characterised and timed these three processes in the massive Pan-Cancer Analysis of Whole Genomes dataset of 2,658 cancer whole genome sequences.
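
As an illustration of what "clustered" means for kataegis, the sketch below flags runs of closely spaced single base changes on one chromosome. It uses a simple heuristic: at least six consecutive mutations, each within 1,000 bases of the previous one. These thresholds are assumptions chosen for the example, and the heuristic is not necessarily the method used in this project. The function name is hypothetical.

```python
def find_mutation_clusters(positions, max_gap=1000, min_mutations=6):
    """Return (start, end, count) for each run of closely spaced mutations.

    positions     -- mutation coordinates on a single chromosome
    max_gap       -- largest distance allowed between neighbouring mutations
    min_mutations -- smallest run length reported as a cluster
    Both thresholds are illustrative assumptions.
    """
    ordered = sorted(positions)
    clusters = []
    if not ordered:
        return clusters

    run = [ordered[0]]
    for pos in ordered[1:]:
        if pos - run[-1] <= max_gap:
            run.append(pos)  # still inside the current run
        else:
            if len(run) >= min_mutations:
                clusters.append((run[0], run[-1], len(run)))
            run = [pos]  # start a new run at this mutation

    # The last run is not closed inside the loop.
    if len(run) >= min_mutations:
        clusters.append((run[0], run[-1], len(run)))
    return clusters


example_positions = [100, 5000000, 5000200, 5000450, 5000700,
                     5000900, 5001100, 5001300, 9000000]
clusters = find_mutation_clusters(example_positions)
```

In this toy input, the seven mutations between positions 5,000,000 and 5,001,300 form one cluster, while the two isolated mutations are ignored.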
We found kataegis to be widespread, affecting over 60% of tumours. It appears to activate late during tumour evolution and
In general, the work performed during this action pushes the boundaries of our knowledge of tumour development and evolution. It provides analysis methods and tumour annotations for the wider research community to leverage and build on. Results have been communicated on various occasions to the scientific community, clinical practitioners and the general public.
More specifically, the work on DTCs reveals that only a subset of cells previously believed to be cancer cells have really spread from the patient’s tumour. DTCs are genetically very similar to the original tumour and arise late during tumour evolution. Taken together, the findings help determine the right therapy and suggest there is a longer window than previously thought for cancer to be diagnosed and treated before it spreads.
Likewise, the tools developed for deconvolution and analysis of homozygous deletions provide a powerful orthogonal way of identifying tumour suppressor genes. This is important because knowing these cancer genes allows development of targeted drugs that will work best for the patient.
Close collaboration with clinician scientists has allowed some of our findings to have a rapid impact on clinical practice. Identification of FOS/FOSB rearrangements as a defining feature of benign bone tumours provides a clear diagnostic measure that is already being used to inform treatment.
Lastly, by consolidating existing collaborations and spawning novel ones, this action has strengthened European scientific excellence and competitiveness in (computational) cancer research.
Regions of the genome frequently lost during tumour evolution