CORDIS - EU research results

The role of RNA in centromere biology and genome integrity

Periodic Reporting for period 4 - cenRNA (The role of RNA in centromere biology and genome integrity)

Reporting period: 2021-01-01 to 2022-06-30

During mitosis, chromosomes must be divided equally between two daughter cells to maintain genome stability. Any loss of genetic information leads to aneuploidy, genome instability, and genetic diseases including birth defects or cancer. It is therefore crucial to understand in full molecular detail how chromosome segregation is regulated. The crucial genomic locus for genome stability is the centromere of every chromosome: the kinetochore forms on this specialized chromatin structure, and spindle microtubules must attach to the kinetochore to generate the forces that pull the chromosomes apart during mitosis. Centromeric chromatin structure is highly conserved but the underlying DNA sequence is very variable, and it has long been accepted that centromeres are defined epigenetically by their specialized chromatin composition rather than by any specialized DNA sequence. The epigenetic determinant of centromeric chromatin is the high occupancy of the histone H3 variant CENP-A. Centromeric chromatin is embedded in large blocks of so-called pericentromeric heterochromatin, highly condensed and repetitive DNA sequences that stretch over megabases of the genome and make up about 4% of the human or Drosophila genome. These repetitive regions were ignored for a long time because of their repetitiveness and their lack of protein-coding genes. In recent years, however, researchers have noticed that heterochromatin plays important roles in genome maintenance. We previously found that repeat regions from pericentromeric chromatin of Drosophila melanogaster are transcribed in mitosis and that these transcripts are important for genome stability. These transcripts do not code for proteins but belong to the class of non-coding RNAs. Non-coding RNAs are involved in diverse processes, from transcriptional and translational control to immune response and developmental regulation.
Important for the projects described in this proposal is that non-coding RNAs have emerged as crucial components of chromatin structures and have implications in epigenetic processes. Identifying the nature and function of transcripts that associate with or derive from (peri-)centromeric repeat regions, and of their interacting proteins, is the primary aim of this ERC Consolidator Grant. The major objectives of our work on centromeric RNAs are, firstly, to identify all RNAs that associate with, and therefore potentially influence, centromeric chromatin. Since regulatory RNAs usually function in complex with proteins, we also set out to identify and functionally characterize proteins that localize to centromeric chromatin through RNA. Last but not least, it is important to determine whether these RNAs or the protein-RNA complexes are misregulated in different human diseases. If we fully understand their regulation under physiological conditions, we hope to then also understand how their misregulation, for instance in certain cancer entities, leads to disease progression, which will ultimately help us to progress towards disease prevention. This ERC grant has given us some important insights into how RNA transcripts are regulated at centromeres and how these RNAs influence the function of centromeres and the inheritance of centromeric chromatin.
The major objective of this ERC CoG is to identify and functionally characterize centromeric RNA and how it influences centromere function. To do so, I defined five objectives, and my team has addressed all aspects of them. In Objective 1 we focused on the identification of SATIII RNA binding partners and identified, among many candidates, five SATIII RNA-binding proteins that were largely uncharacterized in Drosophila. We found that these proteins form a complex that localizes to the nucleus and the nucleolus and interacts with SATIII RNA. We characterized their molecular function in general, and in combination at centromeres and with centromeric RNA in particular. We found that this novel complex is involved in the repression of SATIII RNA transcription in the germ line and that this repression is essential for germ line development and general embryonic development. The revised manuscript is under consideration at the moment. In Objective 2 we focused on structural aspects of centromeric RNA. We optimized the purification of centromeric proteins in sufficient quantity for hydrogen/deuterium (1H/2H) exchange combined with high-resolution mass spectrometry, in order to identify how SATIII RNA influences the folding of its interacting protein(s). We have zoomed in on CAL1, the loading factor of the epigenetic determinant CENP-A, and its ability to bind cenRNAs. This project is still ongoing. Even though we have identified some structural features that are influenced by RNA binding, the precise role of these RNA-dependent structural changes is still unclear. The epigenetic aspects of SATIII RNA have been a major focus of Objective 3, and we identified the importance of SATIII RNA during spermatogenesis. We have found that SATIII RNA is inherited through both the male and female germ lines and influences development as a classical epigenetic determinant.
We investigated the effect of the loss or misexpression of SATIII RNA on early development and CENP-A loading in the germ line, using newly created fly lines that misexpress SATIII RNA or have the genomic region of SATIII deleted. The aim of Objectives 4 and 5 was to identify novel RNAs at centromeres. We had to develop an approach to isolate the RNAs from early Drosophila embryos, since cultured cells did not give us any clear results. The embryonic data gave us a strong enrichment of transcripts from several transposable elements (TEs). The centromeric function of these TEs is still under debate and hard to dissect: the identified TEs are not specific to centromeres, so separating a centromeric from a non-centromeric function has been challenging. We are still working on this project with the aim of finalizing some functional analysis. If we are unsuccessful, we will publish the identification procedure and the nature of the TEs, since this seems to be a conserved feature and is interesting from an evolutionary point of view. Importantly, however, we have found that some of these TE RNAs bind centromeric components. In addition, we have built a large-scale meta-analysis pipeline covering more than 45,000 deep sequencing data sets, so that we could bioinformatically identify novel RNAs and proteins at centromeres, pericentromeres, and any other repetitive region of the genome. This has been a massive undertaking and has yielded a wealth of novel information about repetitive regions, which we will dissect in the years to come.
The main progress beyond the state of the art, and beyond what we expected, came from our computational analysis. We re-analysed around 45,000 ChIP-seq data sets and found a treasure trove of novel factors that interact with centromeric chromatin. Especially important are some transcription factors and chromatin remodelling factors for which inhibitors are in clinical trials, without it being appreciated that these factors influence centromeric chromatin. We will work on the characterization of many of these factors for many years to come.
centromeres on mitotic chromosome
