CORDIS - EU research results

Dissecting the molecular building principles of locally formed transcriptional hubs

Periodic Reporting for period 1 - DisMoBoH (Dissecting the molecular building principles of locally formed transcriptional hubs)

Reporting period: 2021-09-01 to 2023-08-31

To function properly, each cell in our body requires a set of genes to be expressed as proteins in the correct amounts. To achieve this task, our cells utilize a complex network of regulatory sequences, short fragments embedded in our genome that do not code for proteins. These regions recruit a subset of proteins called transcription factors, which in turn establish communication between regulatory sequences and the downstream genes whose expression they control. How exactly transcription factors coordinate the communication between different regulatory parts of the genome, however, remains unknown.
The project addresses this question in two ways. The first approach used a large dataset summarizing the regulatory activity of all transcription factors found at a given genomic region, and compared this activity across a set of genetically diverse individuals. Since each human harbors a specific set of DNA mutations that makes their genome unique, regulatory activities can vary from individual to individual as a function of these mutations. Leveraging this information, the project investigated which composition of transcription factors predisposes a regulatory region to become sensitive to mutations.
An important finding is the discovery of a subset of ‘cooperativity-enhancing’ transcription factors that boost the activity of many others, thus turbo-charging effects introduced by DNA mutations. The same set of cooperativity drivers is also important for the establishment of communication between two or more regulatory regions.
As certain DNA mutations can predispose individuals to diseases, or initiate malignancies such as cancer, the insights gained from the first part of the project may help researchers not only to identify malignant mutations faster, but also to interpret their mode of action. The latter is an essential first step toward personalized treatment plans.
The second part of the project is devoted to the development of a new methodology that can detect communication of distant regulatory regions with their target genes, while simultaneously measuring gene output. The approach allows researchers to test what effect DNA mutations in regulatory regions, for example those found in certain cancers, have on gene synthesis. As the method can test many different conditions at once, it provides mechanistic insights into which regulatory networks are involved in the process. The insights obtained from such experiments will improve our understanding of why and how mutations in the regulatory parts of our genome trigger disease. Once mechanisms are understood, they can be targeted by therapeutics.
As such, the overall objective of this proposal was to develop the necessary molecular tools that will allow us to better understand how DNA mutations ultimately lead to disease and what vulnerabilities in disease-linked regulatory networks can be exploited for the development of targeted therapies.
The developed method was applied to a disease-linked genomic region, the AXIN2 gene, which harbours a small mutation linked to the progression of chronic lymphocytic leukaemia. By systematically probing which transcription factors establish communication between the mutated region and the gene synthesis region, the method revealed that a combination of transcription factors is required to activate AXIN2 synthesis. The approach thus elucidated the mechanism underlying a distinct disease-linked genomic region and provided further insights into how communication between regulatory regions and their target genes is mediated by transcription factors.
For the method development, I first used CRISPR-Cas9 genome editing to generate a human B-cell-derived cancer cell line that harbors a landing pad for Cre recombinase-mediated cassette exchange. Next, I designed a workflow to generate libraries that span several kilobases in length and can be efficiently cloned into target vectors for cellular delivery. For the final approach to work, I also developed a readout method that maps mutations introduced at several locations within a target library to random DNA barcodes that track the expression level of each fragment. For this, I used a long-read sequencing platform combined with a custom computational analysis pipeline.
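The barcode readout described above can be illustrated with a minimal sketch. It is not the project's analysis code: it assumes that each long read has already been reduced to a pair of (barcode, list of detected mutations), and the function name, the mutation labels, and the thresholds `min_reads` and `min_fraction` are hypothetical. Each barcode is assigned the mutation combination supported by most of its reads, and barcodes with too few or conflicting reads are dropped.

```python
from collections import Counter, defaultdict


def build_barcode_map(read_pairs, min_reads=2, min_fraction=0.8):
    """Assign each barcode the mutation combination seen in most of its reads.

    read_pairs: iterable of (barcode, mutations) tuples, one per long read.
    min_reads / min_fraction: hypothetical quality thresholds (assumptions).
    """
    counts = defaultdict(Counter)
    for barcode, mutations in read_pairs:
        # Sort so that the same set of mutations always yields the same key.
        counts[barcode][tuple(sorted(mutations))] += 1

    mapping = {}
    for barcode, mutation_counts in counts.items():
        total = sum(mutation_counts.values())
        top_mutations, top_count = mutation_counts.most_common(1)[0]
        # Keep only barcodes with enough reads and a clear majority call.
        if total >= min_reads and top_count / total >= min_fraction:
            mapping[barcode] = top_mutations
    return mapping
```

Requiring a clear majority per barcode is one simple way to guard against sequencing errors and barcode collisions; a real pipeline would likely apply additional filters.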
Next, I established the entire experimental workflow to read out expression levels of diverse libraries containing enhancer and promoter pairs at distances of over two kilobases, when integrated into the human cancer cell line described above. This included the transfection and sorting strategy, extraction of genomic DNA and the corresponding mature RNA, and finally custom library preparation and sequencing. The workflow is now fully established, and the corresponding publication, together with a detailed experimental method section, is expected to be released by the end of the year. As with the part of the project described below, we will make the publication available on open-access servers such as bioRxiv and provide a link on the lab website.
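Because the workflow extracts both genomic DNA and mature RNA, one common way to summarize reporter-style experiments is a per-construct RNA-to-DNA barcode ratio. The sketch below illustrates that general idea only; whether the project uses this exact normalization is an assumption. The inputs are hypothetical dictionaries of barcode counts and a hypothetical barcode-to-construct dictionary, and `min_dna` and `pseudocount` are illustrative parameters.

```python
import math
from collections import defaultdict
from statistics import median


def construct_activity(rna_counts, dna_counts, barcode_map,
                       min_dna=10, pseudocount=1.0):
    """Median log2(RNA/DNA) per construct, computed from barcode counts.

    rna_counts, dna_counts: dicts mapping barcode -> read count.
    barcode_map: dict mapping barcode -> construct label.
    min_dna, pseudocount: hypothetical filter and stabilizer (assumptions).
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    if rna_total == 0 or dna_total == 0:
        raise ValueError("RNA and DNA libraries must both contain reads")

    per_construct = defaultdict(list)
    for barcode, construct in barcode_map.items():
        dna = dna_counts.get(barcode, 0)
        if dna < min_dna:
            continue  # too little DNA coverage for a reliable ratio
        rna = rna_counts.get(barcode, 0)
        # Scale to counts per million so library depth cancels out.
        rna_cpm = rna / rna_total * 1e6
        dna_cpm = dna / dna_total * 1e6
        ratio = (rna_cpm + pseudocount) / (dna_cpm + pseudocount)
        per_construct[construct].append(math.log2(ratio))

    # The median over barcodes is robust to single outlier barcodes.
    return {c: median(values) for c, values in per_construct.items()}
```

Dividing RNA by DNA abundance corrects for differences in how often each construct was integrated, so that the remaining signal reflects expression per copy.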

For the mechanistic insight part of the project, I performed an in-depth computational analysis of which sequence cues within enhancers harboring chromatin accessibility quantitative trait loci (caQTL) render the mutations causal. I identified a set of transcription factors (TFs) specific to the caQTL enhancer context, which boost the effects exerted by pioneer and activating TFs co-bound to the same locus. Using a series of computational and experimental approaches, I could show that context-only TFs lead to cooperativity in enhancer activity assays, have disordered protein domains, are associated with coactivators such as Bromodomains, and link enhancers that display genetic coordination among molecular phenotypes. Using an experimental setup that compares enhancer activity on a plasmid with activity in an endogenous setting, I could further show that context-only TFs need pioneering TFs to gain access to their binding sites, thus explaining why combinations of the two types of TFs are selected for at caQTL enhancers. In a final set of computational analyses, I could further demonstrate that the environment created at caQTL enhancers, and specifically at genetically linked pairs of enhancers, is due to transcriptional hub formation, which concentrates numerous regulatory factors at these enhancers.
The open-access version of the publication can be found at https://www.biorxiv.org/content/10.1101/2023.05.05.539543v1 and a summary and the code are available on the lab’s GitHub page: https://github.com/DeplanckeLab/Context-TFs.
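To make the notion of a caQTL more concrete, the following sketch regresses chromatin accessibility at one enhancer on the number of alternative alleles (0, 1 or 2) each individual carries. It is a simplified illustration and not the code from the repository above: the function name and inputs are hypothetical, and a real analysis would also include covariates, significance testing and multiple-testing correction.

```python
def genotype_effect(dosages, accessibility):
    """Least-squares slope and intercept of accessibility on allele dosage.

    dosages: alternative-allele counts (0, 1 or 2), one per individual.
    accessibility: normalized accessibility of one enhancer, same order.
    """
    n = len(dosages)
    if n != len(accessibility) or n < 3:
        raise ValueError("need matched measurements from at least 3 individuals")

    mean_x = sum(dosages) / n
    mean_y = sum(accessibility) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("all individuals share the same genotype")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(dosages, accessibility))

    slope = sxy / sxx              # change in accessibility per extra allele
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

A slope clearly different from zero indicates that accessibility at the enhancer varies with genotype, which is the defining property of a caQTL.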
The project achieved all goals originally proposed and the results are expected to be fully disseminated by the end of the year. In addition, the computational analysis provided insights going beyond the originally proposed methodology, identifying a generalizable building principle of how enhancers achieve different functionalities. As such, I expect the insights to be useful for a variety of other disciplines, including Genetics, Developmental Biology and Systems Biology. The developed methodologies, both computational and experimental, serve as roadmaps for investigating the mechanisms underlying gene regulatory networks, so their impact should be far-reaching. However, given the mechanistic nature of the study, impact at the societal level is expected to occur with considerable time delay, as is commonly the case for basic research.
Summary Figure