Coordinated regulation of transcription and pre-mRNA splicing at gene promoters

Final Report Summary - PRE-MRNA SPLICING (Coordinated regulation of transcription and pre-mRNA splicing at gene promoters)

Pre-mRNA splicing is the process by which non-coding RNA sequences (introns) are removed from the transcript and the coding sequences (exons) are joined to form the messenger RNA (mRNA). This is an essential step in gene expression because about 90 % of pre-mRNA is intronic and does not code for protein. Very often, exon usage is alternative; that is, the cell decides whether to remove a part of the pre-mRNA as an intron or to include it in the mature mRNA as an alternative exon. This process, termed alternative splicing, enables the expression of structurally and functionally distinct protein isoforms from a single gene. However, it is still largely unknown which splicing factors are responsible for the inclusion or exclusion of a particular exon. In this project, I set out to identify the targets of an important family of splicing regulators, the SR proteins, and to determine their role in the regulation of co-transcriptional splicing.

To determine the targets of SR proteins, we depleted each of the seven canonical SR proteins, SF2, SC35, SRp20, SRp75, SRp40, SRp55 and 9G8 (SRSF1-7), in mouse P19 cells using RNA interference. Depletion of each individual SR protein had little or no impact on the expression of the other SR protein family members. We then evaluated the global effects of SR protein depletion on alternative splicing and total RNA expression using next-generation sequencing. To this end, polyA+ mRNA from each knockdown was subjected to RNA-seq on the Illumina HiSeq 2000 platform. The reads were mapped to the mouse genome and assigned to features (genes or exons) with htseq-count. Read counts were then used in edgeR or DEXSeq to determine differential gene expression or differential exon usage, respectively, upon knockdown.
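To illustrate what a comparison of read counts between control and knockdown involves, here is a minimal Python sketch that normalises per-gene counts to counts per million (CPM) and computes a log2 fold change. It is not the edgeR or DEXSeq method, which fit statistical models across replicates; the gene names, counts, function names and pseudocount are all hypothetical and chosen only for illustration.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts by library size so that samples are comparable."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def log2_fold_changes(control, knockdown, pseudocount=1.0):
    """Return log2(knockdown / control) per gene, computed on CPM values.

    A pseudocount (an assumption of this sketch) avoids division by zero
    and log(0) for genes with no reads in one condition.
    """
    cpm_control = counts_per_million(control)
    cpm_knockdown = counts_per_million(knockdown)
    return {
        gene: math.log2(
            (cpm_knockdown.get(gene, 0.0) + pseudocount)
            / (cpm_control[gene] + pseudocount)
        )
        for gene in cpm_control
    }


# Hypothetical per-gene read counts, as a counting tool might report them.
control = {"geneA": 500, "geneB": 300, "geneC": 200}
knockdown = {"geneA": 250, "geneB": 550, "geneC": 200}

lfc = log2_fold_changes(control, knockdown)
for gene, value in sorted(lfc.items()):
    print(f"{gene}\t{value:+.2f}")
```

In this toy example geneA is reduced roughly two-fold upon knockdown, geneB is increased and geneC is unchanged. A real analysis would additionally need replicates and a significance test, which is what edgeR and DEXSeq provide.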

SR proteins showed marked differences in their gene regulatory potential. For example, SRSF3 appears to regulate the abundance of the largest number of genes and exons, whereas SRSF4 and SRSF5 regulate only a few. Although some SR proteins share targets, most targets are by and large unique to one family member. This supports the notion that studies focusing on SR proteins should be tailored to each member of this family. Our results show that SR protein regulation of gene expression is complex and can be observed both at the level of alternative splicing and at the level of total gene expression.
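One simple way to quantify how far targets are shared or unique is to compare the sets of regulated genes for each knockdown. The Python sketch below computes pairwise Jaccard overlap and the targets specific to each factor. The target lists are invented placeholders, not results from this project, and the function names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Fraction of shared targets among all targets of two factors."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def unique_targets(targets):
    """For each factor, return the targets not regulated by any other factor."""
    result = {}
    for factor, genes in targets.items():
        others = set().union(
            *(g for f, g in targets.items() if f != factor)
        )
        result[factor] = genes - others
    return result


# Hypothetical target sets; real sets would come from the differential
# expression or exon usage analysis of each knockdown.
targets = {
    "SRSF1": {"g1", "g2", "g3"},
    "SRSF3": {"g3", "g4", "g5", "g6"},
    "SRSF4": {"g7"},
}

pairwise = {
    (a, b): jaccard(targets[a], targets[b])
    for a, b in combinations(sorted(targets), 2)
}
unique = unique_targets(targets)

for pair, score in pairwise.items():
    print(pair, round(score, 3))
for factor in sorted(unique):
    print(factor, sorted(unique[factor]))
```

A low pairwise Jaccard index together with large unique sets would correspond to the situation described above, in which most targets belong to a single SR protein.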

During the course of the project, we also created stable cell lines in mouse P19 and human K562 cells, each expressing a GFP-tagged SR protein under the control of its native promoter. The goal was to determine the genomic location of SR proteins, their association with gene promoters and their putative role in the regulation of transcription. To this end, we used ChIP-seq but failed to observe enrichment of SR proteins at chromatin. Nonetheless, the cell lines are a great resource for our group and for others, for instance for comparative studies of SR protein function. Furthermore, as P19 cells are pluripotent, we now have a great resource for studying SR proteins in neuronal or muscle differentiation.

During transcription, mRNAs undergo modifications in addition to splicing. One of these is 5' end capping, which happens as soon as the nascent transcript emerges from the RNA exit channel of RNA polymerase II. This process is essential for efficient gene expression and plays a critical role in all aspects of the life cycle of an mRNA molecule: the cap structure is required for optimal splicing and translation. We have determined for the first time the genome-wide localization of a component of the nuclear cap binding complex, CBP20, by chromatin immunoprecipitation and deep sequencing (ChIP-seq). To gain insights into the regulation of gene expression, we compared the CBP20 ChIP profile with the gene occupancy of RNA polymerase II and RNA polymerase III:

(i) to establish the frequency of 'productive' RNA pol II transcription; and
(ii) to determine whether pol II occupancy near pol III genes coincided with capped transcripts.

The latter point follows up on a previous study in the lab, which showed that pol II activity upstream of pol III genes regulates transcription of U6 snRNA genes.

In a wider context, our results are also useful for biomedical research. Expression of some SR proteins is mis-regulated in cancer, but the consequences in terms of gene expression and alternative splicing are still unknown. This reflects one of the difficulties in the field of alternative splicing: matching a particular splicing event to the factor that regulates it. We hope the datasets generated in this project, soon to be published, will allow other researchers to investigate particular events of abnormal gene expression and their regulation by SR proteins in the context of disease.