
Transposable element Impacts on Gene Expression and Regulation

Periodic Reporting for period 1 - TIGER (Transposable element Impacts on Gene Expression and Regulation)

Reporting period: 2015-10-19 to 2017-10-18

Transposable elements (TEs) are DNA sequences that are able to spread within and between genomes. TEs have been described in most species analyzed to date and can constitute up to 90% of a host genome. While transposition may have evident harmful effects, TEs can also positively impact the host genome by, for instance, donating intrinsic regulatory elements such as promoters. Hence, TEs contribute to the formation and rewiring of gene regulatory networks and may therefore play a role in species diversity. The main goal of this research project is to understand the regulatory changes that TEs induce within host genomes. Some host species show higher rates of TE activity than others, as observed in Drosophila, where TEs are thought to be responsible for 80% of the visible spontaneous mutations, in contrast with 0.1% in humans. While many examples of TEs acting as regulatory sequences have been described in mammals and plants, very few cases have been observed in Drosophila, mainly because of the strains studied and the methods used.
The overall objective of this research project is to understand the extent of TE-derived regulatory sequences in different Drosophila species and strains and their impact on host gene expression and regulation. I planned to uncover gene expression differences between D. melanogaster strains that are due to TE-derived transcription start sites, and to study the epigenetic regulation of such promoters. In addition, I included a new in silico analysis of the presence of TE sequences within transcripts obtained from paired-end RNA sequencing data sets from wild-derived populations of both D. melanogaster and D. simulans. The use of wild-derived strains in genome-wide studies is completely novel. While most researchers to date have based their analyses on TE copies present in the sequenced D. melanogaster strain, or have simply analyzed single TE copies, this proposal combines two approaches in order to fully grasp the impact of TEs on transcriptome diversity: 1) the use of insertionally polymorphic copies (copies present in one strain and absent from another) and 2) genome-wide datasets. This project emerges from a very dynamic field of research, in which TEs were only very recently shown to act as promoters in Drosophila. The findings of this project should be broadly significant for the fields of Drosophila genetics, transposable element biology, host defence and evolution.
The D. melanogaster strains studied were recently collected in France and in Brazil. Two methods that were new to the host laboratory were set up there: chromatin immunoprecipitation (ChIP) and RAMPAGE. In Drosophila, active chromatin is often associated with trimethylation of lysine 4 of histone H3 (H3K4me3), while repressive chromatin and, more importantly, repressed TE sequences are associated with trimethylation of lysine 9 of histone H3 (H3K9me3). Cross-linked chromatin immunoprecipitation was performed on Drosophila embryos, and the genome-wide sequencing data should be available in the coming months. RAMPAGE is an assay, originally performed on Drosophila samples, that produces genome-wide libraries of transcripts with intact transcription start sites (TSSs). I have troubleshot RAMPAGE in our strains and, in order to have matched epigenetic (ChIP-seq) and RAMPAGE libraries, I waited to perform RAMPAGE on the same pool of tissue from which the chromatin was extracted for the ChIP analysis. Hence, the RAMPAGE libraries are currently being prepared and will be sent for sequencing in the coming weeks.
During the ChIP troubleshooting period and the delays imposed by technical problems, I developed two in silico projects. The first aims at deciphering the role of TE sequences in the differential expression of genes between Drosophila strains but, unlike the original proposal, this project will detect exonizations and truncations in addition to TSSs. The second aims at uncovering gene expression differences between strains of Drosophila simulans that are due to differential piRNA production; piRNAs are well-known regulators of TEs. Both projects are novel and took advantage of genome-wide data already produced in the host laboratory, which allowed an immediate start. We found that fewer than 2% of genes contain a TE sequence within their transcripts. Interestingly, these "chimeric genes" show higher expression than genes lacking any TE sequence and, in addition, chimeric genes are over-represented among genes upregulated between Drosophila strains compared with stable or downregulated genes. We are currently confirming these results in candidate genes that are upregulated and harbor TE sequences within their transcripts. In sum, this first in silico project suggests that, in Drosophila, TEs could potentially increase the expression of host genes. Concerning the second in silico project, we have shown that small RNAs of 23-29 nt with a ping-pong signature are potentially able to target host genes. Interestingly, these piRNAs do not necessarily map to a TE sequence, suggesting either that the TE sequences are too degraded to be recognized or that genes are able to produce their own piRNAs. The most curious finding, however, is that most genes are targeted by such piRNAs, with or without TE similarity. These piRNAs target coding sequences and are depleted in 5' untranslated regions, where most regulatory sequences are found. We are currently mining for gene expression differences between populations that are due to these piRNAs.
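The ping-pong signature mentioned above refers to the characteristic 10-nt overlap between the 5' ends of piRNAs that map to opposite strands. As an illustration only, and not the pipeline used in this project, a minimal Python sketch of how such a signature might be scored is given here. The function names, the 0-based half-open coordinate convention, the z-score summary and the toy reads are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev

PING_PONG_OVERLAP = 10  # expected 5'-to-5' overlap (nt) of ping-pong partners


def overlap_histogram(reads, min_len=23, max_len=29):
    """Count 5'-to-5' overlaps between reads mapped to opposite strands.

    reads: iterable of (chrom, strand, start, end), 0-based half-open
    coordinates (an assumed convention for this sketch).
    Only reads in the piRNA size range [min_len, max_len] are kept.
    """
    plus_five_prime = defaultdict(Counter)   # chrom -> {start of '+' read: count}
    minus_five_prime = defaultdict(Counter)  # chrom -> {end of '-' read: count}
    for chrom, strand, start, end in reads:
        if not min_len <= end - start <= max_len:
            continue
        if strand == "+":
            plus_five_prime[chrom][start] += 1
        elif strand == "-":
            minus_five_prime[chrom][end] += 1

    hist = Counter()
    for chrom, plus_counts in plus_five_prime.items():
        minus_counts = minus_five_prime.get(chrom)
        if not minus_counts:
            continue
        for start, n_plus in plus_counts.items():
            # Overlaps are capped at min_len so that the overlap never
            # exceeds the length of either read (a simplification).
            for overlap in range(1, min_len + 1):
                n_minus = minus_counts.get(start + overlap, 0)
                if n_minus:
                    hist[overlap] += n_plus * n_minus
    return hist


def ping_pong_zscore(hist, max_overlap=20):
    """Z-score of the 10-nt overlap count against overlaps 1..max_overlap.

    Returns None when the background has no variation.
    """
    background = [hist.get(k, 0) for k in range(1, max_overlap + 1)
                  if k != PING_PONG_OVERLAP]
    spread = pstdev(background)
    if spread == 0:
        return None
    return (hist.get(PING_PONG_OVERLAP, 0) - mean(background)) / spread


# Hypothetical toy reads: two pairs overlapping by 10 nt, one pair by 5 nt,
# and one 15-nt read that falls outside the piRNA size range.
toy_reads = [
    ("2L", "+", 100, 125), ("2L", "-", 85, 110),
    ("2L", "+", 300, 326), ("2L", "-", 284, 310),
    ("2L", "+", 500, 524), ("2L", "-", 480, 505),
    ("2L", "-", 95, 110),
]

histogram = overlap_histogram(toy_reads)
print(dict(histogram))
print(ping_pong_zscore(histogram))
```

In practice the reads would come from small RNA alignments rather than a hand-written list, and an excess of pairs at an overlap of 10 nt relative to other overlaps would be taken as evidence of ping-pong amplification.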
In conclusion, although the project was shortened by almost a year because of conflicting schedules with a permanent position, the development of two novel and very competitive projects has greatly improved the outcomes and potential publications of the initial proposal. With the analysis of both the ChIP and RAMPAGE datasets by the end of the year, along with the two in silico projects, this proposal will generate at least three research articles on the impact of TEs on the expression and epigenetic regulation of genes in Drosophila. We have produced, for the first time, an epigenetic and TSS map of D. melanogaster strains, which will reveal the chromatin marks associated with TE copies, notably TE-derived promoters. Population analyses in a genome-wide context are rare but necessary if we want to understand population dynamics and evolution. We hope these three projects will encourage other researchers to include more strains in their projects in order to account for genetic and epigenetic variability.
The pipelines developed in the two in silico projects will be made publicly available and can be used by other researchers working on Drosophila or other model species. In addition, while this project focuses on identifying TE-derived TSSs, the data generated have many applications, since changes in gene expression can occur in several ways. By mapping all promoters and characterizing all transcripts, we will acquire extensive information on any promoter responsible for gene expression differences that may exist between D. melanogaster strains. These data can also be used to quantify TE expression differences between strains, along with the epigenetic regulation of such TEs. The data generated by this project should therefore be of significant value to many researchers and will constitute the first genome-wide promoter and chromatin mapping survey in different wild-derived strains of D. melanogaster. In sum, the objectives proposed by this project are only the beginning of the exploitation of the data produced.
Impact of transposable element regulatory sequences on nearby genes