Mobile elements: shuffling the regulome in development and disease

Final Report Summary - EVOLINE (Mobile elements: shuffling the regulome in development and disease.)

The proposed research aims to investigate the role of mobile genetic elements (MGEs) in the genetic diversity of the mammalian somatic genome and their role in healthy and pathological conditions. Surprisingly, only ~5% of the human genome is composed of coding sequences (i.e. exons), which are assisted by regulatory elements that ensure the proper coordination of genomic information in time and space. By contrast, about half of the human genome is composed of repeated sequences, a consequence of MGE activity (Lander et al. 2001). MGEs are sequences with different evolutionary origins and with features intermediate between genes and viruses, which multiply within the genome by different mechanisms (Kazazian and Moran 2017). MGEs are present in all living forms and have been proven to be a source of genetic mutations, actively contributing to the generation of the genetic diversity necessary for evolution.
The mobilisation of MGEs underlies insertional mutagenesis, in which the mutation consists of a piece of DNA inserted at a new genomic location. Frequently, the inserted piece is an MGE copy, and when the insertion occurs in the cell lineage that gives rise to reproductive cells (the germline), it can be transmitted to the progeny and become heritable. This is how MGE copies have accumulated over evolution. Nonetheless, aberrant mobilisation and the accumulation of mutations over time have disabled most MGE families, leaving only young families with few competent copies (Boissinot et al. 2016). Typically, elements from old MGE families are fixed and present in all human individuals, while elements from young MGE families are often polymorphic, present only in a subgroup of individuals (Ewing and Kazazian 2010), and are subject to natural selection. Active MGEs are abundant within the latter group (Beck et al. 2010). Remarkably, although MGEs act as selfish DNA, they were recently shown to be active in selected somatic tissues. Indeed, insertions occurring during development outside the germline lineage lead to somatic genome mosaicism in newborns. Thus, different cells of the same individual do not have exactly the same genome, owing to somatic MGE insertions.
The phenotypic effect of an MGE insertion depends on its impact on the regular function of its genomic environment and on the number of cells carrying the insertion. Therefore, germline insertions are more likely to generate a phenotype than somatic insertions. Indeed, the first MGE shown to be active in humans was discovered in a case of haemophilia caused by a de novo germline insertion in the maternal allele of the coagulation factor VIII gene (Kazazian et al. 1988). Nonetheless, somatic insertions can have a dramatic phenotypic impact even when present in few cells. As an example, a case of colorectal cancer was likely triggered by a somatic MGE insertion in the APC gene (Scott et al. 2016). These and other studies have revealed that currently active MGEs in humans are associated with sporadic genetic disease and oncogenesis (Miki et al. 1992; Payer et al. 2017; Faulkner and Garcia-Perez 2017).
Long interspersed element class 1 retrotransposons (LINE-1s or L1s) are the only autonomous MGEs currently active in humans. An active L1 contains an internal promoter and encodes its own machinery for mobilisation (called retrotransposition), allowing its own transcription and the generation of new copies by reverse transcription of its RNA at a new genomic location (Cost et al. 2002). Because of its deleterious potential, mobilisation is controlled by the host and reduced to tolerable levels. Indeed, a background level of mobilisation is tolerated both in the germline and in healthy somatic tissues such as the brain (Ewing and Kazazian 2011; Garcia-Perez et al. 2010; Baillie et al. 2011; Evrony et al. 2012; Upton et al. 2015; Erwin et al. 2016; Macia et al. 2017). Epigenetic silencing of the L1 promoter is one of the main mechanisms controlling L1 activity (Muotri et al. 2010; Yu et al. 2001). A family of zinc finger proteins (ZFPs) has evolved to target old and inactive MGE families, silencing them through KAP1-mediated recruitment of histone modifiers for heterochromatin formation (Jacobs et al. 2014; Castro-Diaz et al. 2014). However, the single currently active L1 subfamily in the human genome, termed L1Hs, is not regulated by ZFPs, and its repression appears to be controlled mostly by DNA methylation of its promoter (Muotri et al. 2010; Marchetto et al. 2013). The underlying mechanisms responsible for silencing young and active L1Hs elements remain unknown.
The EVOLINE project aimed to determine the role of somatic insertions in healthy and pathological human conditions, focusing especially on the brain and on cancer respectively, and using mouse and human samples. The detection of somatic insertions recently became achievable through several high-throughput sequencing approaches, although these reported contradictory levels of mobilisation in the brain (Baillie et al. 2011; Evrony et al. 2012; Upton et al. 2015). To resolve this, within the frame of EVOLINE we designed a conservative strategy to screen for somatic insertions in human brain samples. It consisted of combining three different sequencing techniques, applied to both bulk tissue and whole genome amplified (WGA) single-cell DNA from the same samples. The three techniques were: retrotransposon capture sequencing or RC-seq (Baillie et al. 2011; Shukla et al. 2013; Upton et al. 2015), L1-sequencing (Ewing and Kazazian 2010; Evrony et al. 2012) and direct whole genome sequencing (Evrony et al. 2015). From these experiments, we inferred levels of L1 retrotransposition in the human brain closer to the lower of the previously reported estimates, although these levels are still subject to a false-negative detection rate. Notably, we also confirmed that L1 mobilisation can occur in precursor cells and, therefore, that the same L1 insertion can be shared by several neurons. The combination of WGA and RC-seq for the detection of somatic L1 insertions was published in a methodological book chapter (Sanchez-Luque et al. 2017).
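The conservative screening idea described above, keeping only candidate insertions supported by more than one technique, can be illustrated with a minimal Python sketch. The record layout, the `min_methods` threshold and the 100 bp position tolerance are assumptions made purely for illustration; the report does not specify how calls from RC-seq, L1-sequencing and whole genome sequencing were actually reconciled.

```python
from collections import namedtuple

# Hypothetical record for one candidate insertion call (not the project's format).
Call = namedtuple("Call", ["chrom", "pos", "method"])


def _emit(group, min_methods, consensus):
    """Append the group to `consensus` if enough distinct techniques support it."""
    methods = sorted({c.method for c in group})
    if len(methods) >= min_methods:
        consensus.append((group[0].chrom, group[0].pos, methods))


def consensus_insertions(calls, min_methods=2, tolerance=100):
    """Chain calls on the same chromosome whose neighbouring positions lie
    within `tolerance` bp, then keep chains seen by >= `min_methods` techniques.
    Both parameters are illustrative assumptions."""
    consensus = []
    group = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.pos)):
        # Start a new group when the chromosome changes or the gap is too large.
        if group and (call.chrom != group[-1].chrom
                      or call.pos - group[-1].pos > tolerance):
            _emit(group, min_methods, consensus)
            group = []
        group.append(call)
    if group:
        _emit(group, min_methods, consensus)
    return consensus


example_calls = [
    Call("chr1", 1000, "RC-seq"),
    Call("chr1", 1040, "L1-seq"),   # same site, second technique
    Call("chr1", 5000, "RC-seq"),
    Call("chr1", 5010, "RC-seq"),   # same technique twice: not enough
    Call("chr2", 500, "WGS"),       # single technique only
]
print(consensus_insertions(example_calls))
```

In this toy input only the chr1 site near position 1000 is retained, because it is the only one detected by two different techniques. Requiring agreement lowers false positives at the cost of a higher false-negative rate, which is consistent with the caveat stated in the text.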
After resolving the rate of LINE-1 mobilisation in the brain, we next focused on deciphering how young, active L1Hs elements evade repression in somatic cells. Thus, within EVOLINE, we developed a new approach for resolving the DNA methylation status of the promoter of locus-specific L1Hs elements, achieving a higher resolution and coverage over the L1 promoter than any prior study (Coufal et al. 2019; Scott et al. 2016; Tubio et al. 2014). Notably, this new approach allowed us to demonstrate that there are several pathways by which L1Hs elements can avoid epigenetic repression, resulting in the accumulation in somatic tissues of new L1 insertions derived from non-silenced locus-specific L1s. From a mechanistic angle, we identified domains within the L1 promoter that are essential for the as yet unknown mechanisms driving DNA methylation. We also found that the silencing pathway we are uncovering seems to be conserved in old L1 subfamilies up to 70 million years old. Given their abundance (i.e. >17% of our DNA is made of L1 copies), L1 silencing is a major part of the epigenetic reprogramming that occurs in the early embryo and during development. In sum, the main results from EVOLINE provide critical information about how a balance is kept between guarding our genome from the deleterious effects of L1 mobilisation and allowing a background of activity that is likely convenient for evolution. The main results of EVOLINE are currently under publication (Sanchez-Luque et al. 2019, Molecular Cell, in second round of review).
Remarkably, the higher resolution in the analysis of locus-specific L1 elements revealed that DNA methylation is heterogeneous within cell populations. Even elements that are efficiently silenced remain unrepressed in a very low percentage of cells, which can only be detected with high-resolution methodologies like the one we developed within EVOLINE (Nguyen*, Carreira*, Sanchez-Luque* et al. 2018; Schauer et al. 2018; Salvador-Palomeque*, Sanchez-Luque* et al. 2019). Thus, we conclude that, in addition to the somatic genome mosaicism generated by de novo L1 insertions, there is also an epigenetic mosaicism affecting polymorphic and fixed L1 insertions. My research within EVOLINE also suggested that this epigenetic mosaicism occurs in proliferating cells, either in the healthy embryo (Salvador-Palomeque*, Sanchez-Luque* et al. 2019) or in tumours. The derived insertions could, respectively, impact the genome of entire subsets of cells in adult tissues or influence tumour progression. Indeed, by applying RC-seq, we found L1-driven insertional mutagenesis events shaping the genome of cancer cells, with some examples likely driving cancer progression during chemotherapy (Nguyen*, Carreira*, Sanchez-Luque* et al. 2018; Schauer et al. 2018). In some cases, we were able to identify donor L1 elements in the tumour samples and confirm that they were unrepressed to a variable extent within the cancer cell population. Remarkably, some of these elements were also unmethylated in a few cells of the adjacent healthy tissue, suggesting that oncogenic cellular proliferation potentially allows the mobilisation of L1s that were already unrepressed in the original somatic cells, which contributes to generating tumour genetic diversity and to tumour progression.
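The distinction drawn above between average promoter methylation and a small fraction of unrepressed cells can be made concrete with a short Python sketch. It assumes, purely for illustration, that each sequenced molecule covering one L1 promoter is encoded as a string of `1` (methylated CpG) and `0` (unmethylated CpG), and that a fully unmethylated molecule is a proxy for an unrepressed element; the function name, the encoding and the threshold are hypothetical and are not taken from the EVOLINE methodology.

```python
def summarise_locus(reads, max_methylated_fraction=0.0):
    """Summarise per-molecule CpG methylation for a single L1 promoter locus.

    reads: list of non-empty strings, one per sequenced molecule,
           '1' = methylated CpG, '0' = unmethylated CpG (assumed encoding).
    Returns (mean methylation, fraction of molecules at or below the threshold).
    """
    if not reads or any(len(r) == 0 for r in reads):
        raise ValueError("reads must be a non-empty list of non-empty strings")
    # Methylated fraction of each individual molecule.
    per_read = [r.count("1") / len(r) for r in reads]
    mean_methylation = sum(per_read) / len(per_read)
    # Molecules counted as unrepressed under the illustrative threshold.
    unrepressed = sum(1 for f in per_read if f <= max_methylated_fraction)
    return mean_methylation, unrepressed / len(per_read)


example_reads = ["1111", "1111", "1101", "0000"]
mean_meth, unrepressed_fraction = summarise_locus(example_reads)
print(mean_meth, unrepressed_fraction)
```

For this toy locus the mean methylation is high, yet one molecule in four is entirely unmethylated. A bulk average alone would hide that minority, which is why per-molecule resolution matters for detecting the kind of epigenetic mosaicism described in the text.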