CORDIS - EU research results

Laying the Biological, Computational and Architectural Foundations for Human Cell Lineage Discovery

Periodic Reporting for period 4 - LineageDiscovery (Laying the Biological, Computational and Architectural Foundations for Human Cell Lineage Discovery)

Reporting period: 2020-03-01 to 2021-01-31

What is the problem/issue being addressed?
Our LineageDiscovery study addresses fundamental open questions in human biology and medicine that relate to the human cell lineage tree. It investigates the structure, dynamics and variance of the human cell lineage in development, adulthood and ageing, during disease progression, and in response to therapy.

Why is it important for society?
Despite decades of research, we still do not know how a cancer metastasizes and thereby becomes lethal. Hypotheses abound, but a definite answer has not emerged, in part due to a lack of adequate methods. A human cell lineage tree that captures cancer history, heterogeneity and topography at cellular resolution has never before been attempted, and it promises ground-breaking insights into cancer biology and therapy. Knowledge on the scale provided by human cell lineage analysis also aids our understanding of developmental biology, the landscape of immune system maturation, and stem cell dynamics. It bears on questions such as: which cancer cells give rise to metastases? Is relapse after chemotherapy caused by ordinary tumor cells escaping chemotherapy stochastically, or by cancer stem cells that escape chemotherapy due to their slow division rate? Do beta cells, oocytes, neurons or heart muscle cells renew during adulthood? Moreover, unraveling the dynamics of diseased cells, which depend on the specific cellular microenvironment and on stochastic events, through their cell lineage tree can help in selecting the appropriate treatment, thus advancing personalized medicine.

What are the overall objectives?
The overall objective of LineageDiscovery is to lay the biological, computational and architectural foundations for a large-scale human cell lineage discovery project, and to establish its feasibility and value via collaborative proof-of-concept cell lineage discovery experiments built on these foundations.
Below are detailed technological achievements for each of the three objectives:

We have developed an efficient prototype cell lineage discovery workflow.
The duplex MIPs-based cell lineage workflow is composed of the following steps: (a) Design of duplex MIPs precursors: the desired targets are selected from our cell lineage database and precursors are designed; (b) Duplex MIPs preparation: the duplex MIPs precursors are synthesized on a microarray, collected and amplified by PCR as a pool. The PCR product is digested to remove the universal adaptors (red and green), and the digested product is purified and diluted to obtain active duplex MIPs; (c) Capture and sequencing: duplex MIPs and template DNA are mixed together, the targeting arms (blue and yellow) anneal to the regions flanking the targets, and the MIPs are then circularized by gap filling with DNA polymerase and ligase. Linear DNA, including excess MIPs and template DNA, is digested by exonucleases, and an Illumina sequencing library is generated for each sample separately by adding adaptors and barcodes using PCR. The libraries are pooled and sequenced on an Illumina NGS platform, and the raw reads are then analyzed to detect mutations. This mutation information is used to infer the cell lineage tree (FIG. 1).
Our computational pipeline takes targeted single-cell (SC) sequencing data and uses it to generate a lineage tree of the input cells. The pipeline includes our error model, designed to address the noise introduced in vitro by the polymorphic nature of STRs. The stutter error model is based on analyzing a synthetic STR sequencing library to calibrate a Markov model that predicts stutter patterns at any amplification cycle. The resulting biallelic STR signal is used as the input to the genotyping algorithm, which is based on an approximately maximum-likelihood approach, to construct the lineage tree. Different statistical approaches, such as bootstrapping, are also applied along the way to evaluate the quality of the signal and to validate the stability of the lineage tree.
An end-to-end automated system for the analysis of SC DNA, targeting 14K MS loci, has been implemented and deployed on the Weizmann server farm (FIG. 5).
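
As an illustration of the stutter error model and genotyping step described above, the following is a minimal sketch and not the project's implementation. It assumes a deliberately simplified Markov chain in which, at every amplification cycle, a molecule loses one repeat unit with probability `p_down`, gains one with probability `p_up`, or is copied faithfully. The function names, the default rates and the single-allele (rather than biallelic) likelihood are all hypothetical choices made for the example.

```python
import math

def stutter_step(dist, p_down, p_up):
    """Apply one amplification cycle to a distribution over repeat counts."""
    n = len(dist)
    out = [0.0] * n
    for r, mass in enumerate(dist):
        if mass == 0.0:
            continue
        # At the boundaries the impossible move is disabled so mass is conserved.
        down = p_down if r > 0 else 0.0
        up = p_up if r < n - 1 else 0.0
        out[r] += mass * (1.0 - down - up)
        if down:
            out[r - 1] += mass * down
        if up:
            out[r + 1] += mass * up
    return out

def predict_stutter(true_len, cycles, p_down=0.01, p_up=0.002, max_len=40):
    """Predicted read-length distribution after `cycles` cycles (assumed rates)."""
    dist = [0.0] * (max_len + 1)
    dist[true_len] = 1.0
    for _ in range(cycles):
        dist = stutter_step(dist, p_down, p_up)
    return dist

def call_genotype(read_counts, cycles, candidates, **rates):
    """Pick the candidate repeat count that maximizes the multinomial log-likelihood.

    read_counts maps an observed repeat count to its number of reads.
    """
    best, best_ll = None, -math.inf
    for g in candidates:
        dist = predict_stutter(g, cycles, **rates)
        ll = 0.0
        for r, c in read_counts.items():
            p = dist[r] if 0 <= r < len(dist) else 0.0
            ll += c * math.log(max(p, 1e-300))  # floor avoids log(0)
        if ll > best_ll:
            best, best_ll = g, ll
    return best

# Hypothetical read histogram for one locus after 30 cycles.
observed = {10: 80, 9: 18, 11: 2}
genotype = call_genotype(observed, cycles=30, candidates=range(5, 16))
```

In a calibrated setting, `p_down` and `p_up` would be fitted to the synthetic STR library rather than fixed by hand, and the likelihood would be extended to mixtures of two alleles.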

We presented eSTGt, a programming and simulation environment for population dynamics. The language captures in broad terms the effect of the changing environment while abstracting away the details of interactions among individuals. An eSTG program consists of a set of context-free stochastic tree grammar transition rules. When a program is executed, the tool generates the corresponding lineage trees as well as the internal state values. The tool allows researchers to use existing biological knowledge to model the dynamics of a developmental process and to analyze its behavior over the course of its history. Simulated lineage trees can be used to validate various hypotheses in silico and to predict the behavior of dynamical systems under various conditions.
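
To make the idea of context-free stochastic tree grammar rules concrete, here is a small sketch that simulates a lineage tree from such rules. It does not use eSTGt syntax or its API: the rule table, the cell-type symbols `S` and `D`, the probabilities and the function names are hypothetical, and the sketch omits rates and internal states.

```python
import random

# Hypothetical rule table: symbol -> list of (probability, children produced).
RULES = {
    "S": [(0.6, ("S", "S")),   # symmetric self-renewal
          (0.3, ("S", "D")),   # asymmetric division
          (0.1, ("D", "D"))],  # symmetric differentiation
    "D": [(1.0, ())],          # terminal: differentiated cells stop dividing
}

def expand(symbol, depth, rng, rules=RULES):
    """Return a nested (symbol, [children]) lineage tree.

    Each rule depends only on the symbol being rewritten (context-free),
    and one rule is drawn at random according to its probability.
    """
    if depth == 0:
        return (symbol, [])
    options = rules[symbol]
    weights = [p for p, _ in options]
    _, children = rng.choices(options, weights=weights, k=1)[0]
    return (symbol, [expand(c, depth - 1, rng, rules) for c in children])

def leaves(tree):
    """Collect the symbols of the cells at the tips of the tree."""
    symbol, children = tree
    if not children:
        return [symbol]
    return [leaf for child in children for leaf in leaves(child)]

# A seeded generator makes the simulated tree reproducible.
tree = expand("S", depth=4, rng=random.Random(0))
tips = leaves(tree)
```

Repeating the simulation with different seeds gives a distribution of trees that can be compared against a reconstructed lineage tree when testing a hypothesis in silico.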

We demonstrated the feasibility and value of human cell lineage discovery via collaborative proof-of-concept experiments.
Below is a list of a few of our collaborators whose samples, after processing and analysis, yielded satisfactory results. In all the figures below, the colors represent the biological classifications assigned by our collaborators, the width of a tree branch represents the p-value, a hypergeometric significance score calculated for these classifications, and the triangles represent the bootstrap support calculated for the tree structure, indicating its statistical stability.

• Christoph Klein: Experimental Medicine, University of Regensburg, Germany - Uncovering mechanisms of metastasis formation by lineage analysis of individual primary tumor and metastatic cells.
FIG. 12: Lineage tree for malignant melanoma patient.
FIG. 16: Lineage tree reconstruction for breast cancer tumor cells.

• Tsila Zuckerman, Hematology & Transplantation, Rambam, Haifa, Israel - Studying leukemia at diagnosis, remission and relapse, including resistance to chemotherapy and relapse initiation.
FIG. 19, 20: Lineage tree reconstruction for Acute Myeloid Leukemia tumor cells.

• Ruth Halaban, Department of Dermatology, Yale U. New Haven, CT, USA
FIG. 3: Cell lineage reconstruction of SCs from melanoma metastases and PBLs from the patient YUCLAT

• Ruby Shalom Feuerstein, Department of Genetics and Developmental Biology Rappaport Faculty of Medicine Technion - Israel Institute of Technology.
FIG. 21: Cell lineage tree for cornea cells
Our collaborations have demonstrated how our system can be used to investigate the progression of different cancer types in specific human patients, which can in turn be used to tailor treatment to each patient.
Now that the merits of the developed system have been demonstrated, it can be applied to many open questions in biology, in both cancer and developmental research.
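
The hypergeometric significance score that sets the branch widths in the figures can be sketched as follows. This is an assumption about the form of the test, not the project's code: it computes an upper-tail enrichment p-value for one classification within the cells under a branch, and the function and parameter names are hypothetical.

```python
from math import comb

def hypergeom_enrichment(N, K, n, k):
    """P(X >= k) for a hypergeometric draw.

    N: total cells in the tree
    K: cells in the tree carrying the classification
    n: cells under the branch being scored
    k: cells under that branch carrying the classification
    """
    total = comb(N, n)
    upper = min(n, K)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible combinations contribute nothing to the sum.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

# Hypothetical example: 10 cells, 5 labelled, and a branch whose 5 cells are all labelled.
p = hypergeom_enrichment(N=10, K=5, n=5, k=5)
```

A small p-value means the classification is concentrated under that branch more than a random assignment of labels to cells would predict.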
Cell lineage tree for cornea cells.
Creation of a Next-Generation-Sequencing-based platform for cell lineage reconstruction and its appl
Cornea single cell lineage tree reconstructed by the Neighbor-Joining method
Lineage tree reconstruction for LCL465 Acute Myeloid Leukemia tumor cells
Lineage tree reconstruction for LCL440 Acute Myeloid Leukemia tumor cells
Lineage tree reconstruction for 795-09 breast cancer tumor cells
Lineage tree for MM16-423 malignant melanoma patient.
Duplex MIPs-based cell lineage workflow