
Population genomics and experimental evolution of ribosomal RNA gene variants in Arabidopsis thaliana

Periodic Reporting for period 1 - rDNAevol (Population genomics and experimental evolution of ribosomal RNA gene variants in Arabidopsis thaliana)

Reporting period: 2017-04-01 to 2019-03-31

The central importance of ribosomal RNA genes (rDNA) for our understanding of biology cannot be overstated: they are evolutionarily the oldest genes, they are the most highly expressed genes in any organism, and their expression is central to cellular growth. Because of the requirement for large quantities of rRNA, eukaryotic genomes contain clusters with hundreds to thousands of rDNA copies arranged in tandem. Despite their high copy number, there is little sequence variation among rDNA copies within an individual and across individuals of a given species, due to the still mysterious process of concerted evolution. Since not all rDNA copies are expressed, we can already suspect that selection cannot act directly on all of them.

rDNAevol will take full advantage of the diversity of genetic resources available in the model plant Arabidopsis thaliana to study concerted evolution in the context of silent or active rDNA copies. Specifically, rDNAevol’s key scientific aims are the following:

(1) Perform a population genomic analysis of the sequence variability within silent and active rDNA clusters.
(2) Generate targeted induced mutations in rDNAs by genome editing.
(3) Describe the fate and fitness of both natural rDNA variants and newly induced mutations.

The project has been marked by a combination of meaningful advances, a spinoff publication of partial results, and methodological pitfalls.

In regard to aim (1), I have exceeded my ambitious goals and have nearly completed Next-Generation Sequencing (NGS) data acquisition from a vast collection of experimental populations for the population genomics analysis. Similarly, I have re-generated high-coverage NGS data for the founder parental lines of the experimental populations, a step that was not envisioned at the beginning of the project but became necessary for a proper analysis of low-frequency rDNA variants. An analysis of genetic variation in 5S rDNAs in the Arabidopsis 1001 Genomes data grew into a collaboration with the group of Dr. Aline Probst from the "Genetic Development and Reproduction" (GReD) research unit in Clermont-Ferrand (France), and has recently been published in the journal Nucleic Acids Research (Simon et al., 2018; doi: 10.1093/nar/gky163).

In regard to the methodological aim (2), the work has been postponed because the host lab has assembled new, optimized vectors for CRISPR-Cas9-induced mutations that streamline the isolation of transgene-free events. In parallel, an amplicon-based NGS pipeline for rapid identification of mutations in hundreds of lines has been built. Both are now fully operational, and I can start with the proposed mutagenesis scheme.

In regard to aim (3), I am only waiting for the conclusions of the analysis in aim (1) to select a subset of strains for the experimental evolution study.

Although our understanding of the evolution of rDNA repeats has improved over the last decades, no attempt has been made to study it in the context of silenced or active chromatin. Since I believe that ribosomes engineered at the rDNA level in multicellular eukaryotes are not a distant possibility, a clear understanding of the fate and fitness of induced mutations at these crucial genomic loci is absolutely necessary for the successful development of stable strains.
Cartoon of the repetitive structure of ribosomal DNA loci.