CORDIS - EU research results

Deciphering the evolution and roles of cytosine DNA methylation across eukaryotes

Periodic Reporting for period 2 - METHYLEVOL (Deciphering the evolution and roles of cytosine DNA methylation across eukaryotes)

Reporting period: 2023-03-01 to 2024-08-31

DNA methylation is the modification of the basic code of DNA (A, T, C, G) by the addition of a methyl group. This chemical modification occurs predominantly on cytosines. Methylated cytosines for the most part behave like normal nucleotides: they still pair with the other strand of DNA and encode the same amino acids. However, they can change the way the genome is read. In some species, a methylated cytosine prompts silencing of that region, which will not be transcribed. In other lineages, however, cytosine methylation is found in transcribed regions. In vertebrates, which include mammals, cytosine methylation has become the default state, and most cytosines followed by a guanine are constantly methylated. This presents an evolutionary conundrum, as a seemingly indispensable part of gene regulation is highly variable across evolution. If this modification of DNA can have such important roles, and we can trace its origins back to the ancestor of cells with a nucleus (the eukaryotes), we need the broadest possible sampling of what it is doing across lineages to figure out why it is so variable and when its various roles emerged. Finding which organisms and lineages have cytosine methylation is an important output of this project, but we also need rigorous approaches to test what it is doing in these lineages. This is what this project aims to do: bring the state-of-the-art techniques commonly used in epigenetic research on mammalian cells to the broadest possible diversity of eukaryotes. This includes many previously neglected lineages of unicellular eukaryotes, known as protists, as well as an emerging model system in evolutionary developmental biology, the sea anemone Nematostella vectensis. Using a wide range of techniques, we are working out how this epigenetic mark evolved and what its impact on gene regulation is in some of the states not characterised before.
One of the complementary objectives is to understand how the group of proteins that evolved alongside DNA methylation to read and interpret this mark emerged. We know that some proteins in mammals and plants are capable of binding exclusively to methylated cytosines, but it is currently not known how these proteins evolved. Ultimately, this multidisciplinary project will address a major question in evolutionary biology: the origins of an important mechanism of gene regulation.
This work is in principle basic research, aimed at questioning our assumptions about an important biological process that is currently studied intensively to understand cancer and development in mammals. By using non-conventional model systems we will further characterise what is unique about this gene regulatory mechanism in mammals, and why it evolved to achieve such complex roles, including imprinting, X-chromosome inactivation and control over genomic parasites. In parallel, a mechanistic understanding of alternative DNA methylation pathways in eukaryotes could bring biotechnological innovations. Some of the genes that read or write DNA methylation in distantly related organisms could be applied in mammalian synthetic biology approaches, superimposing a new code on top of DNA without modifying the underlying sequence.
During the first half of the project, we first established cultures of many species in the laboratory. We are growing a range of unicellular eukaryotes whose common ancestors lived more than 1 billion years ago. Eukaryotic diversity is often perceived from a very biased perspective, as only the large multicellular lineages are taken into account. These include animals, plants and fungi. However, these are only 3 branches within the eukaryotic tree of life, which is much more diverse. Some unicellular lineages have been studied because they are economically important for humans, for instance as parasites that cause health issues (e.g. malaria). However, many of these parasites have reduced genomes and have diverged a lot from their ancestral states. We targeted species from lineages that have been neglected in this type of gene regulation research, but that represent important lineages in eukaryotic diversity. Since we can grow them in the laboratory, we can obtain DNA from these species and profile their DNA methylation landscapes using state-of-the-art technologies. One of the methods that we are currently developing for this is Oxford Nanopore sequencing. This is a revolutionary technique that reads the primary DNA sequence while at the same time detecting the modifications it might contain, such as cytosine DNA methylation. While its use for base modifications has mostly been applied to mammalian DNA, we are now using it regularly in a wide range of taxa, validating its general use. Since methylation patterns are lineage specific, being able to use Nanopore to both sequence and profile DNA methylation will be critical to disentangle complex samples. For example, when sequencing new genomes, we have found that some bacterial commensals or symbionts that live with the main species present completely different methylation patterns from the eukaryote DNA, which greatly simplifies the separation of the two DNA sources.
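The idea of separating DNA sources by their methylation patterns can be illustrated with a minimal Python sketch. This is not the project's pipeline. It assumes that per-contig methylation fractions have already been computed from the sequencing data, and it further assumes, purely for illustration, that the contaminating bacteria carry adenine methylation while the eukaryote methylates cytosines mainly at CG sites. The `ContigMethylation` record, the `classify_contig` helper, the thresholds and the contig values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ContigMethylation:
    """Hypothetical per-contig summary of methylation calls."""
    name: str
    cpg_5mc: float      # fraction of CG cytosines called methylated (0 to 1)
    adenine_6ma: float  # fraction of adenines called methylated (0 to 1)


def classify_contig(contig, cpg_min=0.5, a6_min=0.05):
    """Label a contig from its methylation profile.

    The thresholds are illustrative assumptions, not validated cut-offs.
    """
    high_cpg = contig.cpg_5mc >= cpg_min
    high_6ma = contig.adenine_6ma >= a6_min
    if high_6ma and not high_cpg:
        return "bacterial-like"
    if high_cpg and not high_6ma:
        return "eukaryote-like"
    # Mixed or absent signal: leave for manual inspection.
    return "ambiguous"


def partition(contigs):
    """Group contig names by their predicted source."""
    bins = {"eukaryote-like": [], "bacterial-like": [], "ambiguous": []}
    for contig in contigs:
        bins[classify_contig(contig)].append(contig.name)
    return bins


# Made-up values, for illustration only.
example = [
    ContigMethylation("contig_1", cpg_5mc=0.82, adenine_6ma=0.01),
    ContigMethylation("contig_2", cpg_5mc=0.03, adenine_6ma=0.35),
    ContigMethylation("contig_3", cpg_5mc=0.60, adenine_6ma=0.20),
]
print(partition(example))
```

In practice the profiles would be richer than two numbers (for example, methylation by sequence motif), and species with little or no cytosine methylation would need a different criterion. The sketch only shows why lineage-specific patterns make the binning step simpler.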
As part of this exploration of DNA methylation patterns, we found a unique example of DNA methylation usage in a unicellular species closely related to animals. This species had large chunks of the genome fully methylated, and when we looked at them more closely we realised they were giant virus DNA. Giant viruses are large in both particle size (some were originally mistaken for bacteria) and genome length. They were discovered about two decades ago, yet we still do not understand much of their biology. In the unicellular eukaryote we study, DNA methylation has allowed this potentially lethal viral DNA to integrate into the host genome many times, making up to 14% of its genetic content. This implies that host and virus DNA mix recurrently, with DNA methylation mediating this potential conflict by silencing the foreign DNA. Still, some of the genetic material from the virus might eventually be used by the host to perform useful functions, in a process that is probably gradual. This genetic domestication of viruses is emerging as a common major theme in the evolution of eukaryotes, and our contribution is to show that DNA methylation is probably an important step in this flow of genetic information. This story is currently available on the preprint server bioRxiv for the community to use, and is under consideration at a peer-reviewed journal.
This project has gone beyond the state of the art in several respects. First and most importantly, we have used a curated group of model systems to cover many neglected taxa. Assuming that a biological mechanism described in a given lineage is the same in others can be safe for some molecular processes, but for DNA methylation this is not the case. Accordingly, we have found a range of methylation patterns and roles not previously even proposed within the community studying DNA methylation. Also, through studying this diversity, we found that some features that were probably ancestral in eukaryotes, and therefore critical for the early success of cells with a nucleus in a predominantly bacterial world, have diverged in some lineages. This implies that animals or plants, for instance, are divergent from this ancestral state, and therefore the features we find in these lineages are probably an adaptation to multicellularity rather than simply an inheritance from early eukaryotes.
We have also brought technological advancements to the field. While many studies have limited their sampling of DNA methylation to antibody-based and bisulfite-conversion-based techniques, each with inherent biases, we have adopted a couple of emerging technologies that provide orthogonal validation and ease of use. These include Enzymatic Methyl-sequencing, which profiles cytosine methylation without the DNA damage caused by bisulfite treatment. However, some reports suggested that Enzymatic Methyl-seq could be biased by DNA sequence composition. We find that this is rarely the case in our samples, which contain very variable methylation contexts and patterns, and come from highly divergent organisms. Furthermore, we have applied the latest advancements in Nanopore sequencing technologies to the study of DNA methylation across species. This sequencing technique is being widely adopted for genome sequencing because of its flexibility and ease of use; however, its application to the study of base modifications was mostly restricted to mammalian cells. We validate this technique in many new lineages, opening the gates for further applications of this promising technology to any new genome.
Some of the goals and objectives are still ongoing. Although the preliminary data are exciting, we need to validate our findings to corroborate these observations. However, we foresee that by the end of this project, our understanding of DNA methylation evolution and its roles will have changed drastically. Given the large number of researchers that use or could use DNA methylation information in their research, a proper and rigorous understanding of what it might be doing in their model system of choice is crucial for the correct interpretation of their data.