Periodic Reporting for period 1 - PhaeoCREEvol (Regulatory sequence evolution during major transitions in complex multicellularity in the brown algal radiation)
Reporting period: 2023-10-01 to 2025-09-30
Brown algae represent the third most complex lineage on Earth and are vastly understudied relative to animals and plants. They evolved only ~250 million years ago, making them one of the youngest groups with complex multicellularity. Furthermore, extant brown algae exhibit extensive diversity in their complexity, ranging from relatively simple branched filaments (e.g. Ectocarpus) to complex three-dimensional thalli (e.g. kelps). Similar diversity exists in life cycles, sexual systems and genome sizes. Critically, there have been several independent transitions in complexity in the group. Taken together, the recent emergence and diversity of brown algae present an elegant natural experiment in which to ask what genetic changes coincide with, and have potentially driven, transitions in complexity.
PhaeoCREEvol set out to answer this fundamental question. The project aimed to do this using two key methods. First, it would utilise comparative genomics analyses across the diversity of brown algae (~50 available genomes) to identify conserved noncoding sequences that likely have regulatory function (also known as cis-regulatory elements, or CREs). Second, it would employ multiomics approaches such as ATAC-seq to functionally identify CREs in five species of varying complexity. Combining these data, we would ask whether transitions to greater developmental complexity are associated with lineage-specific increases in CREs. We also aimed to ask how these CREs are acquired, specifically looking at the role of transposable elements, mobile genetic elements that are known to drive CRE evolution in animals and plants. Finally, we set out to ask whether distal gene regulation had emerged in the most complex brown algae using HiC data, mirroring the evolution of long-range gene enhancers in vertebrates and some angiosperms.
The project's main ambitions were to provide fundamental insights in evolutionary biology. Additionally, brown algae are of substantial ecological and economic significance, with brown algal forests forming keystone habitats across 25% of coastlines globally. PhaeoCREEvol also aimed to increase our understanding of brown algal genomics and genetics, bringing benefits to global biodiversity and food security.
Nonetheless, substantial progress was made with the two other aims of the project. Focussing on the model Ectocarpus, PhaeoCREEvol produced the first curated database of transposable elements for any brown algal species. This work revealed a substantial diversity of transposable elements, some of which encoded protein domains that had not previously been described in mobile elements.
The three-dimensional genome structure of six brown algal species was investigated using HiC technology. We found that the genomes of more complex species were organised into topologically associating domains (TADs), a landmark discovery in brown algal genomics.
PhaeoCREEvol also made fundamental contributions to our wider understanding of brown algal evolution. Phaeoviruses, a lineage of complex “giant” viruses, were known to integrate into brown algal genomes forming endogenous viral elements (EVEs). We showed that these EVEs are actively formed as part of a latent life cycle of phaeoviruses, and characterised how these viruses actively integrate their genome into that of their host.
Harnessing our new knowledge of brown algal transposable elements, we also discovered the structural basis of centromeres across the clade. Remarkably, most brown algal centromeres consist of a small cluster of the same transposable element family, representing co-evolution between the host genome and a mobile genetic element over hundreds of millions of years. The production of three genome assemblies belonging to species from previously unsampled major brown algal lineages was critical to these discoveries.
Finally, PhaeoCREEvol enabled collaboration in multiple additional projects that yielded impactful results during the course of the project. The success of the project resulted in the recipient achieving a tenure-track position at the University of Melbourne, one of the world's top academic institutes.
We acquired substantial new understanding of how brown algal genomes of varying complexity are organised in 3D space. Using HiC data, we showed that more complex brown algal genomes feature topologically associating domains (TADs), which are absent in simpler species such as Ectocarpus. This suggests that increased morphological complexity is associated with increased 3D organisation of the genome, which may translate to increased complexity in gene regulation. The HiC datasets also enabled us to produce chromosome-level assemblies for six brown algal species, and to characterise the history of major genome rearrangements in the clade. We showed that the relatively simple morphologies of the Ectocarpales group are associated with a substantial increase in genome rearrangements, suggesting that syntenic relationships between genes may be more tightly conserved in more complex species.
We demonstrated that phaeoviruses actively integrate their giant genomes (hundreds of kilobases) into their brown algal host genomes during infection. We characterised the enzyme and mechanism of integration, which involves a tyrosine recombinase similar to those found in latent phages. The discovery of latent giant viruses in a multicellular eukaryote host is unprecedented and represents a landmark in virology.
Centromeres have recently emerged as one of the most exciting research fields in genomics. PhaeoCREEvol characterised the locations and structural components of centromeres across brown algae, discovering that most species possess centromeres formed of a single family of LTR retrotransposons. This discovery represents one of the most ancient associations between a specific transposon family and centromeres, and is a striking case of evolutionary convergence with the chromovirus LTR elements that are found at the centromeres of many angiosperms. We also showed that centromeres are hotspots of genomic rearrangements that have become fixed in particular brown algal lineages.