DNA methylation is the modification of the basic code of DNA (A, T, C, G) by the addition of a methyl group. This chemical modification occurs predominantly on cytosines. Methylated cytosines for the most part act like normal nucleotides: they still pair with the other strand of DNA and encode the same amino acids. However, they can change the way the genome is read. In some species, a methylated cytosine prompts silencing of that region, which is then not transcribed. In other lineages, however, cytosine methylation is found in transcribed regions. In vertebrates, which include mammals, cytosine methylation has become the default state, and most cytosines followed by a guanine are methylated constantly. This presents an evolutionary conundrum, as a seemingly indispensable part of gene regulation varies greatly across evolution. If this modification of DNA can have such important roles, and we can trace its origins back to the ancestor of cells with a nucleus (the eukaryotes), we need the broadest possible sampling of what it does across lineages to work out why it is so variable and when its various roles emerged. Finding which organisms and lineages have cytosine methylation is an important output of this project, but we also need rigorous approaches to test what it is doing in these lineages. This is what the project aims to do: bring the state-of-the-art techniques commonly used in epigenetic research on mammalian cells to the broadest possible diversity of eukaryotes. This includes many previously neglected lineages of unicellular eukaryotes, known as protists, as well as an emerging model system in evolutionary developmental biology, the sea anemone Nematostella vectensis. Using a wide range of techniques, we are working out how this epigenetic mark evolved and what its impact on gene regulation is in lineages where it has not been characterised before.
A complementary objective is to understand the origin of the proteins that evolved alongside DNA methylation to read and interpret this mark. We know that some proteins in mammals and plants bind exclusively to methylated cytosines, but it is currently not known how these proteins evolved. Ultimately, this multidisciplinary project will address a major question in evolutionary biology: the origins of an important mechanism of gene regulation.
This work is in principle basic research, aimed at questioning our assumptions about an important biological process that is currently studied intensively to understand cancer and development in mammals. By using non-conventional model systems, we will further characterise what is unique about this gene regulatory mechanism in mammals, and why it evolved to achieve such complex roles, including imprinting, X-chromosome inactivation and control over genomic parasites. In parallel, a mechanistic understanding of alternative DNA methylation pathways in eukaryotes could lead to biotechnological innovations. Some of the genes that read or write DNA methylation in distantly related organisms could be applied in mammalian synthetic biology approaches, superimposing a new code on top of DNA without modifying the underlying sequence.