Learning and modeling the molecular response of single cells to drug perturbations

Project information

DeepCell

Grant agreement ID: 101054957

DOI

10.3030/101054957

EC signature date 28 November 2022

Start date 1 January 2023

End date 31 December 2027

Funded under

European Research Council (ERC)

Total cost

€ 2,497,298.00

EU contribution

€ 2,497,298.00

Coordinated by

HELMHOLTZ ZENTRUM MUENCHEN DEUTSCHES FORSCHUNGSZENTRUM FUER GESUNDHEIT UND UMWELT GMBH
Germany

Periodic Reporting for period 1 - DeepCell (Learning and modeling the molecular response of single cells to drug perturbations)

Reporting period: 2023-01-01 to 2025-06-30

Recent advances in single-cell genomics (SCG) have enabled unprecedented insights into a cell’s molecular state, including how it responds to perturbations. However, current SCG approaches primarily rely on descriptive statistics, limiting their predictive power in modeling cellular behavior under different conditions. A key challenge remains: how to systematically predict a cell’s internal state across all possible perturbations, particularly drug-induced changes. DeepCell aims to address this gap by integrating multi-omics single-cell readouts with machine learning, allowing for the accurate modeling of drug responses and the optimization of treatment strategies across diverse cell types.

Building on a pilot study that successfully predicted gene expression changes in response to stimuli, DeepCell extends this approach by developing a multi-condition, multi-modal deep-learning framework for both normal and spatially resolved genomic data. Unlike classical small-scale systems biology models, the DeepCell model introduces greater flexibility, enabling the interrogation of complex drug interactions and the characterization of gene regulatory landscapes through deep network interpretation.

DeepCell provides a unique opportunity to leverage cell-based drug screens for fundamental questions in gene regulation and treatment outcome prediction. As a proof of concept, the project focuses on identifying key regulators of enteroendocrine lineage selection in the intestine. To achieve this, we have designed a 500-compound single-cell organoid RNA-seq screen, incorporating compounds from a spatial imaging screen of 200,000 intestinal organoids. These data will be modeled using DeepCell to predict optimal treatment strategies for obese mice, laying the groundwork for future in silico drug screening. This approach has the potential to accelerate drug discovery and transform clinical decision-making by enabling rapid computational predictions of drug effects.

To formalize the challenge of predicting transcriptomic responses to small molecules, DeepCell introduces a comprehensive benchmark that provides a standardized framework (data, models, and evaluation metrics) for systematically assessing machine learning methods for drug response prediction. The benchmark is based on a high-quality single-cell dataset profiling 146 chemical compounds in peripheral blood mononuclear cells (PBMCs) from three donors, capturing transcriptomic signatures before and after drug exposure. In an open competition with over 1,300 participants, cutting-edge machine learning models, including neural networks and transformer-based approaches, demonstrated the ability to accurately predict drug-induced gene expression changes in unseen conditions.
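To make the notion of an evaluation metric concrete, the following is a minimal sketch of how predicted expression changes could be scored against observed ones. The metric used in the benchmark is not stated in this summary; the sketch assumes a mean row-wise root mean squared error, where each row holds the per-gene change for one held-out compound and cell type combination. The function name and the toy numbers are hypothetical.

```python
import math


def mean_rowwise_rmse(y_true, y_pred):
    """Average, over rows, of the RMSE computed across the genes of each row.

    Assumption: each row is one held-out (cell type, compound) condition and
    each column is one gene's differential expression value.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of rows")
    row_scores = []
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("rows must have the same number of genes")
        squared_error = sum((t - p) ** 2 for t, p in zip(true_row, pred_row))
        row_scores.append(math.sqrt(squared_error / len(true_row)))
    return sum(row_scores) / len(row_scores)


# Toy data (hypothetical): 2 conditions x 3 genes.
observed = [[1.0, -2.0, 0.5], [0.0, 0.3, -1.2]]
predicted = [[0.8, -1.5, 0.0], [0.0, 0.0, -1.0]]

# A trivial baseline that predicts "no change" for every gene.
baseline = [[0.0] * 3 for _ in observed]

model_score = mean_rowwise_rmse(observed, predicted)
baseline_score = mean_rowwise_rmse(observed, baseline)
print(f"model: {model_score:.3f}  baseline: {baseline_score:.3f}")
```

Comparing against a "no change" baseline is a useful sanity check in this setting, since many genes barely respond to a given compound and a model can look accurate without capturing any drug-specific signal.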

To complement these efforts, we developed moscot, a scalable framework for mapping cellular states using optimal transport (OT). Because single-cell sequencing destroys the cell and only a limited number of modalities can be captured simultaneously, aligning distributions of cells efficiently is crucial. The framework enables us to track hundreds of thousands of single cells across multiple time points in developing mouse models, which led to the discovery of the first epsilon-cell-specific transcription factor in pancreatic development. Additionally, we applied moscot to align slices of spatial transcriptomics, an essential step for constructing a comprehensive tissue-level view by integrating gene expression, surface proteins, and chromatin accessibility. Furthermore, we introduced a novel method for spatiotemporal trajectory inference, allowing for the mapping of spatial transcriptomic data over time. These advancements open up new possibilities for modeling cellular state transitions, enhancing our ability to predict cell fate decisions and optimize therapeutic interventions.
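The idea of aligning two cell distributions with optimal transport can be illustrated with a small, self-contained sketch. It is not moscot's API or implementation; it is a plain-Python version of entropy-regularized OT solved with Sinkhorn iterations, and the cell coordinates, the regularization strength `epsilon`, and all function names are assumptions chosen for illustration.

```python
import math


def squared_distance(x, y):
    """Squared Euclidean distance between two points."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def sinkhorn(cost, a, b, epsilon=0.5, n_iters=500):
    """Entropy-regularized OT plan between marginals a (rows) and b (columns)."""
    n, m = len(a), len(b)
    # Gibbs kernel: cheap moves get a large weight, expensive moves a small one.
    kernel = [[math.exp(-cost[i][j] / epsilon) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy data (hypothetical): 2 cells at an early time point, 3 at a later one,
# each described by a 2-dimensional embedding.
early_cells = [(0.0, 0.0), (1.0, 0.0)]
late_cells = [(0.1, 0.1), (0.9, 0.2), (1.1, -0.1)]

# Uniform mass on each side: every cell in a time point is weighted equally.
a = [1.0 / len(early_cells)] * len(early_cells)
b = [1.0 / len(late_cells)] * len(late_cells)

cost = [[squared_distance(x, y) for y in late_cells] for x in early_cells]
plan = sinkhorn(cost, a, b)

# plan[i][j] is the mass coupled between early cell i and late cell j.
for i, row in enumerate(plan):
    print(f"early cell {i}:", [round(p, 3) for p in row])
```

Each row of the resulting plan can be read as a soft assignment of an early cell to its likely descendants. Smaller values of `epsilon` give sharper couplings but need more iterations and, in a production setting, a numerically stabilized solver.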

By addressing the critical need for predictive modeling in single-cell genomics, DeepCell is set to transform drug discovery and precision medicine. The project’s innovative machine learning approaches, combined with high-throughput experimental validation, will accelerate the identification of effective drug treatments and provide a computational foundation for simulating complex biological processes. Through its integration of machine learning, multi-omics data, and large-scale perturbation screens, DeepCell establishes a scalable and interpretable framework that advances both fundamental biology and clinical applications.

1. One achievement of our project is the development of our perturbation benchmark and its publication at NeurIPS 2024. The benchmark set a new standard for evaluating drug-induced transcriptomic changes. It attracted widespread community engagement and set the direction for further research in the predictive modeling of cellular responses to perturbagens.
2. We introduced CellRank 2, a scalable framework that analyzes multiview single-cell data to predict cellular fates and trajectories. It effectively identifies terminal states and fate probabilities, integrates data across time points, and estimates transcription and degradation rates, enhancing understanding of cellular dynamics in development. This work was published in Nature Methods (https://doi.org/10.1038/s41592-024-02303-9).
3. We developed an open-source Python framework for analyzing heterogeneous electronic health records (EHRs). It streamlines data extraction, quality control, and statistical analysis, supporting advanced applications like patient stratification and survival analysis. This work is published in Nature Medicine (https://doi.org/10.1038/s41591-024-03214-0).
4. We introduced scPoli, an open-world learner that integrates single-cell atlases by learning representations to handle heterogeneous data. It supports data integration, label transfer, and reference mapping, effectively managing sample variations and enhancing atlas utility for biological insights. This work is published in Nature Methods (https://doi.org/10.1038/s41592-023-02035-2).
5. We developed an experimental pipeline that uses combinatorial indexing and automation technology for massively parallelized scRNA-seq profiling of post-perturbation organoids.

The DeepCell project has advanced single-cell genomics by integrating machine learning with high-throughput scRNA-seq to model drug responses. A benchmarking platform was developed to evaluate perturbation predictions, alongside CellFlow, a scalable framework that leverages advanced machine learning for predicting cellular responses. Organoid characterization was enhanced through a Human Endoderm-derived atlas and cscANVI, a cancer data integration model for identifying malignant cell states. The implementation of a “lab in a loop” approach has accelerated the feedback between computational predictions and experimental validation, improving model accuracy for drug response modeling, cell fate engineering, and personalized medicine.
