Periodic Reporting for period 4 - DeCRyPT (Deciphering Cis-Regulatory Principles of Transcriptional regulation: Combining large-scale genetics and genomics to dissect functional principles of genome regulation during embryonic development)
Reporting period: 2023-07-01 to 2024-06-30
Using the long-standing model organism Drosophila, we are pioneering new ways to dissect regulatory elements, such as enhancers, that control gene expression. Regulatory elements must come into proximity to each other in precise 3D structures to activate the right genes at the right times. This complex dance is crucial for life, yet remains one of the most elusive puzzles in biology. Defects in this process are often linked to human disease.
Our project is breaking new ground by using different genetic approaches to understand the functional properties of regulatory elements and the chromatin domains in which they reside. Aim 1 used natural sequence variation as a perturbation tool to dissect regulatory landscapes, combined with deep learning to uncover the sequence-based rules. Aim 2 used deletions of regulatory elements to dissect their role in gene expression and genome topology. Aim 3 modified the regulatory context in which enhancers function by reshaping chromatin domains and learning the rules of their formation.
Our results have profound implications that extend beyond genome regulation, impacting society in meaningful ways. First, our research sheds light on which genetic variants might have detrimental impacts, which could help in understanding human disease. Second, by studying enhancer function, we gained new insights into how organisms cope with environmental changes, which has important implications in the context of climate change. Third, we discovered a new mechanism to control the coordinated expression of genes of related function. Fourth, the methods we developed, e.g. to modify chromatin topology or delete combinations of regulatory elements, are powerful innovations that can be applied to any system. These examples underscore the far-reaching impact of our research, highlighting its potential to advance our understanding of the natural world.
Aim 1 used population genetics to uncover new mechanisms of gene regulation. Using F1 embryos from 8 genotypes at 3 embryonic stages, we determined the impact of genetic diversity across different regulatory layers. This comprehensive dataset enabled the use of combined haplotype testing and AI-based deep neural networks. This uncovered several surprises: (1) Genetic variation impacts gene expression more frequently than chromatin features. (2) Allelic imbalance in regulatory elements during embryogenesis is common and highly heritable. (3) Variation in RNA is more predictive of variation in H3K4me3 than vice versa. (4) The ability to buffer genetic variants is influenced by regulatory complexity. (5) The model revealed new partners for well-studied transcription factors, e.g. CTCF. (6) We solved the challenge of obtaining cell-type-specific effects of sequence variation. Aim 1 resulted in 3 publications (PMC7849415, PMC8734213, Sigalova et al. doi: 10.1101/2024.10.24.619975) with a 4th in preparation.
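To illustrate what "allelic imbalance" means in practice: in an F1 embryo both parental alleles share the same cellular environment, so a departure from a 50:50 split of allele-specific reads at a heterozygous site points to a cis-acting effect. The sketch below is a deliberately simplified stand-in, an exact two-sided binomial test, and not the combined haplotype test used in the project, which additionally models total read depth. The function names and read counts are hypothetical, and the exact computation is only suitable for modest coverage (up to a few hundred reads per site).

```python
from math import comb


def allelic_ratio(ref_reads: int, alt_reads: int) -> float:
    """Fraction of allele-specific reads that carry the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no allele-specific reads at this site")
    return ref_reads / total


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance.

    Null hypothesis: each read comes from the reference allele with
    probability p_null (0.5 means balanced expression of both alleles).
    Illustrative assumption only; not the project's haplotype test.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no data, no evidence of imbalance
    # Probability of every possible reference-read count under the null.
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the outcomes that are at most as likely as the observed one.
    # The small relative tolerance guards against floating-point ties.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical read counts at one heterozygous site.
    ref, alt = 30, 10
    print(allelic_ratio(ref, alt), binomial_two_sided_p(ref, alt))
```

In a real analysis such per-site p-values would need correction for multiple testing and for reference-mapping bias, neither of which is handled here.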
Aim 2 dissected functional regulatory domains through genomic deletions of regulatory elements in embryos, measuring their impact on genome topology and expression to assess properties like long-range regulation, enhancer (E) sharing and redundancy. (1) This systematic assessment of E deletion revealed extensive redundancy under normal conditions, but when the embryos are placed under environmental stress, these Es become essential (Dulja, in prep). (2) To identify examples of long-range or shared enhancers, we performed high-resolution Capture-C to link enhancers to their potential target genes and identify potential long-range enhancer-promoter (E-P) pairs for deletion. This revealed an interesting switch in E-P communication during embryogenesis. (3) Functionally dissecting long-range chromatin loops led to an unexpected discovery: many are gene-gene loops, bringing together 2 genes with related function. Through deletions we showed that the loop is essential for coordinating their relative levels of co-expression. Aim 2 resulted in 2 publications (PMC11018526, PMID: 38157845) and 1 in preparation (Dulja et al.).
Aim 3 altered the regulatory context in which enhancers function by reshaping TADs and deciphered the pairing rules of TAD boundaries in Drosophila. (1) To determine when and how TADs are formed, we generated loss-of-function embryos (both maternal and zygotic) for CTCF, BEAF-32 and CP190, and determined their requirement for TAD structure and gene expression. Removing any one insulator had little global impact on the establishment of TADs, although some boundaries (10%) had reduced insulation. Our results suggest context-dependent redundancy and the existence of new insulator proteins. (2) To change the size and content of a gene’s regulatory domain, we used large-scale inversions and deletions, which resulted in TAD fusion and mixing. This changed the context of the gene’s regulatory domain yet had surprisingly little effect on gene expression in most cases, although it did in some, opening many new research directions. (3) By performing large-scale boundary insertions, we identified genomic elements that can form a new boundary in an ectopic location, and those that cannot. Known insulator binding is not predictive of working boundaries. Interestingly, many working boundaries are orientation- and/or context-dependent. We identified new motif pairs that are highly predictive of working boundaries, including motifs for unknown factors, a finding with important evolutionary implications for chromatin organisation (Varisco, Cavalheiro, in prep). Aim 3 resulted in 2 publications (PMC9897672, PMC7116017), and 1 in preparation (Varisco, Cavalheiro et al.).
Aim 1 used high-resolution population genetics in Drosophilids combined with haplotype tests and deep learning to generate predictive models of how sequence variation disrupts different aspects of gene regulation during embryogenesis. This was highly complementary to Aim 2, which deleted entire regulatory elements, singly or in combination, in embryos to dissect the functional relationships between them and their roles in gene expression in vivo. Aim 3 aimed to perturb regulatory landscapes by changing their topology. The resulting data provided new insights into the regulation of gene expression at multiple levels, including what it takes to make a boundary and how that impacts enhancer function, gene expression and embryonic development.