CORDIS - EU research results

Decoding and controlling cell-state switching: A bottom-up approach based on enhancer logic

Periodic Reporting for period 4 - cis-CONTROL (Decoding and controlling cell-state switching: A bottom-up approach based on enhancer logic)

Reporting period: 2021-12-01 to 2023-05-31

Deciphering the genomic regulatory code of enhancers is a key challenge in biology because this code underlies cellular identity. However, there is a significant gap in our understanding of how the DNA sequence of an enhancer encodes the specificity for transcription factor binding, enhancer function, chromatin accessibility, and ultimately target gene regulation. Achieving a better understanding of how enhancers work is important for society because it will improve the interpretation of non-coding genome variation, both in normal and in cancer genomes, and it will enable the generation of cell-type-specific drivers to manipulate cell types, with applications in regenerative medicine and gene therapy.
In this project we address the enhancer decoding challenge from a single-cell perspective, broadly termed single-cell regulatory genomics. In particular, we use a combination of in vivo massively parallel enhancer-reporter assays, single-cell genomics on microfluidic devices, computational modelling, and synthetic enhancer design. We first apply this strategy to decipher the enhancer code in melanoma, a relevant case study because melanoma cells exist in distinct states and can switch between them. We also broaden our experimental model systems by looking at epithelial-to-mesenchymal transition, as well as wound response models. Technological components of this project include the establishment of droplet-based single-cell techniques (e.g. single-cell ATAC-seq and multi-omics) and the development of novel algorithmic and AI approaches to analyze single-cell epigenomics data and to decipher enhancer logic. Ultimately, our aim is to build enhancer models that can be used (1) to prioritize enhancer mutations in cancer and normal genomes, and (2) to design and optimize enhancers that drive cell-type-specific reporter activity, so that they can be used in gene therapy applications, for example to activate a suicide gene in a specific cancer cell state.
We obtained significant results in enhancer decoding using a combination of computational and experimental techniques on a variety of biological model systems. As the first model we used human melanoma, where we studied the regulatory code underlying the two main transcriptional states, namely the melanocytic (MEL) and the mesenchymal-like (MES) state, and the dynamic switch from MEL to MES. We profiled changes in chromatin accessibility (scATAC-seq) and transcriptome (scRNA-seq) at single-cell resolution and used computational techniques, including topic modelling (cisTopic) and gene regulatory network (GRN) inference (SCENIC and SCENIC+), to identify key transcription factors (TFs), genomic enhancers, their constellation of TF binding sites, and the target genes they regulate. We furthermore discovered an intermediate melanoma state and described its regulatory network. We also collaborated with the JC Marine lab to study phenotype switching in patient-derived xenograft models. As a second model we used the mouse liver, for which we generated a single-cell multi-ome atlas and inferred the GRNs underlying each cell type, including the differences between pericentral and periportal hepatocytes (i.e. zonation). Our regulatory models, combined with in vivo validation experiments including high-throughput enhancer reporter assays, revealed the core TFs for hepatocytes and new TFs that control zonation via repression. The third model system we used is the fruit fly Drosophila melanogaster, which allowed us to study enhancer logic in vivo, again during phenotype switching, both in a tumour model and in a wounding model (switching to a senescent state).
We developed several new technologies, including a custom droplet microfluidics technique called HyDrop, which performs scATAC-seq and scRNA-seq at a cost 100 times lower than that of commercial methods. We then benchmarked HyDrop scATAC-seq in a large international consortium. We also optimized massively parallel enhancer reporter assays and used them to measure the activity of melanoma enhancers in different states and of hepatocyte enhancers in vivo in the mouse liver.
Importantly, we developed multiple computational methods: a Python version of SCENIC, called pySCENIC; Nextflow pipelines to facilitate the use of SCENIC; a topic modeling approach, called cisTopic, to analyze scATAC-seq data; a single-cell viewer called SCope; a method to integrate scATAC-seq and scRNA-seq data on a virtual spatial template, called scoMAP; and finally SCENIC+, to infer enhancer-GRNs from single-cell multiome data.
Exploitation and dissemination: Our single-cell and spatial data sets are all publicly available as resources, both as raw and processed data (via GEO and SRA) and as online atlases in SCope. Our HyDrop protocols are all publicly available at protocols.io and are used by dozens of labs around the world. All our methods are available via GitHub, where we also resolved many dozens of issues to help users apply our tools and to improve them.
With the results described above, including the new HyDrop method and multiple computational methods such as cisTopic and SCENIC+, we went well beyond the state of the art. In addition, we developed multiple deep learning models, particularly convolutional neural networks (CNNs), that are trained to predict chromatin accessibility from the sequence of a regulatory region. We developed DeepMEL and DeepMEL2 for melanoma, which can predict enhancer activity in different melanoma cell states (MEL and MES). We used these models to gain novel insight into enhancer function: we applied them across 6 species to study conservation and divergence of melanoma enhancers; we applied them to identify non-coding mutations in melanoma genomes (we sequenced 10 whole genomes with linked-read sequencing); and we applied them to generate synthetic enhancers. Next, we trained a similar CNN, called DeepLiver, on the liver scATAC-seq and MPRA data, leading to new insights into the liver enhancer code, including how Tbx3 and Tcf7l1 act as repressors to establish zonated regulatory networks.
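The idea of using a sequence-based model to assess a non-coding mutation can be illustrated with a small sketch. This summary does not describe how the project scored mutations, so the following assumes a common approach: score the reference and the mutated sequence with the same model and take the difference. The function `toy_score` and the weights in `TOY_MOTIF` are hypothetical stand-ins for a trained network; they are not DeepMEL, DeepMEL2, or DeepLiver.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of four 0/1 values in A, C, G, T order."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def toy_score(encoded, motif):
    """Hypothetical stand-in for a trained model.

    Slides a small weight matrix over the one-hot sequence, keeps the
    best-matching window, and squashes that value into the range (0, 1).
    """
    width = len(motif)
    best = -math.inf
    for start in range(len(encoded) - width + 1):
        total = sum(
            encoded[start + i][j] * motif[i][j]
            for i in range(width)
            for j in range(4)
        )
        best = max(best, total)
    return 1.0 / (1.0 + math.exp(-best))

def mutation_effect(seq, position, alt, motif):
    """Score change caused by substituting `alt` at 0-based `position`."""
    mutated = seq[:position] + alt + seq[position + 1:]
    return toy_score(one_hot(mutated), motif) - toy_score(one_hot(seq), motif)

# Hypothetical weights that favour the 4-bp pattern TGAC (illustration only).
TOY_MOTIF = [
    [-1.0, -1.0, -1.0, 2.0],  # position 1 prefers T
    [-1.0, -1.0, 2.0, -1.0],  # position 2 prefers G
    [2.0, -1.0, -1.0, -1.0],  # position 3 prefers A
    [-1.0, 2.0, -1.0, -1.0],  # position 4 prefers C
]

reference = "AATGACAA"
# Substituting the G inside the TGAC match weakens the best window,
# so the predicted score drops and the difference is negative.
delta = mutation_effect(reference, 3, "C", TOY_MOTIF)
print(delta < 0)
```

With a real trained model in place of `toy_score`, the same difference could be computed for every observed variant to rank candidate mutations by predicted effect.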
As the final outcome of cis-CONTROL, we developed three strategies for the AI-driven design of synthetic enhancers. We evaluated the synthetic enhancers in human melanoma cell lines and in vivo in Drosophila. The design process in turn deepened our understanding of enhancer logic, including the role of repressors versus activators in balancing enhancer activity. We achieved these results through a tight combination of computational modeling and experimental testing.
Exploitation and dissemination: Publications were released on bioRxiv at the time of submission. All deep learning models have been made available via kipoi.org, a resource for sharing trained deep learning models. Our results led to interactions with the pharma and biotech industries, and to a Proof of Concept Grant to further develop our synthetic enhancer design pipeline towards the design of enhancers for gene therapy.