CORDIS - EU research results

Algorithms and experimental tools for integrating very large-scale single cell genomics data

Periodic Reporting for period 4 - scAssembly (Algorithms and experimental tools for integrating very large-scale single cell genomics data)

Reporting period: 2022-06-01 to 2023-05-31

Much of the biology we know, and in particular our own human biology, arises from the collective and coordinated behavior of individual cells. This is true when studying developing embryos and healthy tissues, and becomes even more apparent when looking at different diseases. Major advances in the biological sciences during the last 70 years provided us with an in-depth understanding of some of the most important mechanisms allowing cells to achieve diverse functions, including the remarkable flexibility with which cells control which genes are active in which tissue and in response to multiple environmental cues or inter-cellular communication. The genomics revolution of the last 30 years complemented these discoveries with the possibility of fully reading the genes present in each individual organism or human patient. But until recently it was difficult to link the detailed and fine-tuned behavior of cells with the rich genomic instruction set and to devise practical models that can decipher biological function from the joint activities of millions of single cells.

This gap made it difficult to translate advances in biology into medical impact: most common diseases do not emerge from one faulty gene or from some global malfunction across all the cells in an individual. Instead, diseases emerge when some of our cells develop abnormal behavior, either internally or through miscommunication with other cells. The emerging technology of single cell genomics is revolutionizing our ability to address precisely these types of problems: we can profile genomes or gene activity across thousands of cells from normal or diseased tissue, understand which cells are malfunctioning by comparing them to newly established atlases of cellular behaviors, and examine the impact of different therapies at high resolution and without assuming all cells respond similarly or collectively. The objective of the scAssembly project is to develop new computational methodology for making sense of single cell genomics experiments in health and disease.
To meet the scAssembly goals we developed methodologies for modelling cellular states and dynamics from single cell RNA-seq data. Our overall approach aims at quantitative models of the way cells regulate genes within tissues (static maps), such that we can use additional experimental layers to infer the dynamics of cells over such maps, their interactions with each other, and the gene regulatory mechanisms that drive such dynamics and interactions.

Our research in scAssembly provided significant advances on these challenges. scAssembly thereby contributed to the dramatic improvement in single cell genomics and its impact, as observed in almost all fields of biology. Highlights of our key contributions include:

1. We developed algorithms for partitioning scRNA-seq datasets into metacells: disjoint and homogeneous groups of profiles that could have been resampled from the same cell. Unlike clustering analysis, our algorithm specializes in obtaining granular as opposed to maximal groups (Baran et al 2019). We also worked on a highly scalable implementation of Metacell, capable of handling many millions of profiles (Ben-Kiki et al 2022).

2. Using metacells to represent cell atlases quantitatively, we developed algorithms that allow interpretation of new data (“query”) as projections over an existing atlas (Ben-Kiki et al 2023).

3. We showed how to model dynamics over quantitative transcriptional maps. When applying this to models of cancer immunotherapy, we described how the killing potential of T cells declines as part of the complex dynamics of proliferation, differentiation and turnover (Li et al 2019, Barboy et al 2023 in revision).

4. Since single cells always function in the context of tissues and within ensembles of thousands of other cells, we developed strategies for understanding the dynamics of groups of cells. This was first used to link variation of transcriptional states in single cell populations with the emergent properties of cancer cell populations, in particular given epigenetic deterioration (Meir et al 2020). Even more ambitiously, we developed models describing embryonic development as the collective dynamics of cells over a metacell model (Mittnenzweig et al 2021). Our new “differentiation flow” models became the basis for all our work on embryonic development.

5. We showcased how to use the new tools in order to understand the function of epigenetic regulation during embryonic development. This was approached using in-vivo analysis of embryos lacking a functional de-methylation machinery (Cheng et al 2022), through analysis of chromosomal conformation in embryos (Rappaport et al 2023), and by developing meso-endo embryoid models and characterizing the impact of single, double and triple de-novo methyltransferase knockouts (Mukamel et al 2023).
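
To make the ideas in highlights 1 and 2 concrete, the following is a minimal toy sketch and not the Metacell algorithm or the atlas projection method of the cited papers. It assumes a small cells-by-genes UMI count matrix, groups cells into small disjoint sets along the main axis of expression variation, pools their counts into group-level gene fractions, and then assigns "query" cells to the best-matching group by multinomial log-likelihood. The function names (`toy_metacells`, `pool_profiles`, `project_query`) and the simulated data are hypothetical and chosen for illustration only.

```python
import numpy as np

def toy_metacells(counts, group_size):
    """Partition cells (rows) into disjoint groups of about `group_size` cells.

    Toy stand-in for a real metacell algorithm: cells are ordered along the
    first principal component of log-normalized expression and cut into
    consecutive chunks.
    """
    counts = np.asarray(counts, dtype=float)
    fractions = counts / counts.sum(axis=1, keepdims=True)
    logged = np.log1p(fractions * 1e4)
    centered = logged - logged.mean(axis=0)
    # The first right-singular vector is the main axis of variation.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    order = np.argsort(centered @ vt[0])
    n_groups = max(1, len(order) // group_size)
    labels = np.empty(len(order), dtype=int)
    for group, members in enumerate(np.array_split(order, n_groups)):
        labels[members] = group
    return labels

def pool_profiles(counts, labels):
    """Sum UMI counts within each group and convert them to gene fractions."""
    counts = np.asarray(counts, dtype=float)
    pooled = np.zeros((labels.max() + 1, counts.shape[1]))
    np.add.at(pooled, labels, counts)  # accumulate rows that share a label
    return pooled / pooled.sum(axis=1, keepdims=True)

def project_query(query_counts, atlas_fractions, eps=1e-6):
    """Assign each query cell to the atlas group with the highest
    multinomial log-likelihood of its counts."""
    query_counts = np.asarray(query_counts, dtype=float)
    log_like = query_counts @ np.log(atlas_fractions + eps).T
    return log_like.argmax(axis=1)

# Simulated data (assumption): two cell types over four genes, 20 cells each.
rng = np.random.default_rng(0)
type_a = rng.multinomial(500, [0.7, 0.1, 0.1, 0.1], size=20)
type_b = rng.multinomial(500, [0.1, 0.1, 0.1, 0.7], size=20)
counts = np.vstack([type_a, type_b])

labels = toy_metacells(counts, group_size=10)
atlas = pool_profiles(counts, labels)

# New "query" cells drawn from the first cell type.
query = rng.multinomial(500, [0.7, 0.1, 0.1, 0.1], size=5)
assigned = project_query(query, atlas)
```

The sketch keeps groups small and of roughly equal size rather than merging everything that looks similar, which mirrors the "granular as opposed to maximal" emphasis above only loosely; a real analysis would rely on the published Metacell tools.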

These works provide us with tools that go significantly beyond the original aims set for scAssembly. They provide us with opportunities for translating the single cell genomics technology revolution into quantitative, robust and highly interpretable models and discovery platforms. And they show how to apply new tools to address some of the most fascinating questions in modern biology, from the harmonious emergence of tissues and organs in mammalian embryos to the deterioration of homeostasis and emergence of disease in ageing tissues.
The conclusion of scAssembly sets the stage for several major emerging projects we will pursue next, focusing on two major endpoints: fully spatio-temporal models for mammalian embryonic development, and studies linking single cell states and the physiology of blood ageing.