CORDIS - EU research results

Investigating differentiation using parallel single cell transcriptomic and epigenomic analysis

Periodic Reporting for period 1 - SC-EpiTranscriptome (Investigating differentiation using parallel single cell transcriptomic and epigenomic analysis)

Reporting period: 2019-06-01 to 2021-05-31

Nearly every cell in our body contains the same genetic information in the form of DNA. Yet, to fulfil its tissue-specific roles and to adapt to changing environments, each cell needs a specific protein composition. A crucial step in achieving this is the selective transcription of coding and non-coding regions of the genome. Transcription is induced by the binding of transcription factors to specific cis-regulatory sequences of a gene, which in turn stabilizes the binding and processivity of the core transcriptional machinery. Interestingly, the same transcription factors have been shown to bind to and activate different target genes in different cell types (Arvey et al. 2012), and the presence of transcription factor binding sites at cis-regulatory sequences has proven insufficient to predict whether the transcription factor is actually bound. Epigenetic mechanisms, including histone modifications, have been suggested to explain this discrepancy. Different chromatin states are defined by the combination of these histone modifications, which are thought to carry instructive information in the form of a “histone code”.
Recent studies compared the genome-wide distribution of histone modifications between undifferentiated cells and a variety of differentiated cell types, generating a very detailed picture of their distribution (Zhu et al. 2013). These studies observed clear cell-type-specific patterns that allow different cell types to be distinguished. As both transcriptional activity and the distribution of histone modifications change during differentiation, the question remains which of the two changes first and potentially regulates the other.
Understanding the instructive potential of epigenetic pathways has been shown to be important for two medical fields. In cancer research, whole-genome sequencing studies of human cancer samples revealed mutations in epigenetic pathways in half of the patients, raising the possibility that tumor-supporting cell states are reversible (Flavahan et al. 2017). Additionally, in regenerative medicine, researchers found that removing chromatin marks that inhibit de-differentiation can increase the otherwise low efficiency of induced pluripotent stem cell generation (Watanabe et al. 2013).
Therefore, the main objective of this work was to use genome-wide co-acquisition of transcriptional and epigenetic changes in single cells to provide the first systematic description of the coordinated changes in transcription and the epigenetic landscape during differentiation. Based on these experiments, we aimed to identify changes in which histone modifications play a potentially instructive role and to prove this instructive nature through knockout experiments.
First, we set up the method to recover both mRNA and histone mark positions in single cells at high sensitivity. Before the start of the fellowship, we had established a protein A-micrococcal nuclease-based approach to study the distribution of histone modifications in single cells, which we termed sortChIC (due to its FACS compatibility). Building on this, we first optimised conditions for efficient recovery of the cells' RNA. Classical approaches using PFA proved problematic for chromatin accessibility in repressive regions, while the nuclear preparation used by most alternative single-cell chromatin methods led to a nearly complete loss of cytoplasmic RNA. We set up an alcohol-based fixation approach that had a negligible effect on accessibility while retaining most of the mRNA. To further increase transcriptome complexity, we used an RNA sequencing approach that recovers full-length as well as non-polyadenylated transcripts. The specificity and sensitivity of the approach were further confirmed in cell line experiments, and a computational approach was established to remove potential mis-assignment of transcript-derived reads to chromatin.
The method was adapted to different antibodies by selecting and titrating ChIP-validated antibodies.
Aiming to identify suitable markers for the enrichment of bone marrow differentiation intermediates, we looked at previously generated high-resolution sortChIC data for H3K4me1 and H3K4me3. Despite the enrichment of immature cells in these experiments, cell type transitions were found to be rather discrete, which did not allow the observation of differentiation intermediates. We therefore switched to a biological model with an even higher fraction of differentiating cells. Specifically, we looked at in vitro gastrulation in a system called gastruloids, which starts with the aggregation of around 200 ES cells and, over a timeframe of 7 days, undergoes apical/basal axis formation, gastrulation, tissue formation and somitogenesis. In this fast-developing system, we looked at the distribution of H3K4me3 and H3K27me3 in around 2000 cells per day and histone mark, from day 3 to day 7 (FigA). Analysing this complex dataset, we identified 4 major differentiation trajectories with many cell type intermediates. Analysing the relative changes in chromatin and transcription identified numerous genes for which the switch from an H3K27me3-silenced to an actively transcribed state could be followed, including intermediate stages that had already lost H3K27me3 but had not yet started expressing the gene (example in FigB). With this, we established the first high-resolution method for combined chromatin and transcript profiling in single cells that enables the recovery of full-length cytoplasmic RNA, and generated a detailed dataset of the early steps of in vitro gastrulation.
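The three-state reading of a gene described above (H3K27me3-silenced, intermediate, actively transcribed) can be illustrated with a minimal sketch. It is not the analysis used in the project: the function name, the cut-off values and the toy measurements are all hypothetical, and a real analysis would work on normalised signal across many cells rather than fixed thresholds.

```python
# Hypothetical illustration only: names, cut-offs and values are assumptions,
# not taken from the project data or the sortChIC analysis.

def classify_state(k27me3, expression, k27_cutoff=1.0, expr_cutoff=1.0):
    """Assign one gene in one cell to a coarse regulatory state."""
    marked = k27me3 >= k27_cutoff
    expressed = expression >= expr_cutoff
    if marked and not expressed:
        return "silent"
    if not marked and not expressed:
        return "intermediate"  # H3K27me3 lost, gene not yet expressed
    if not marked and expressed:
        return "active"
    return "mixed"  # marked and expressed; not interpreted in this sketch


# Toy cells ordered along an assumed differentiation trajectory:
# (H3K27me3 signal at the gene, transcript level), arbitrary units.
toy_cells = [(3.2, 0.0), (2.5, 0.1), (0.4, 0.2), (0.3, 0.0), (0.2, 2.8), (0.1, 4.1)]

states = [classify_state(k27, expr) for k27, expr in toy_cells]
print(states)
```

Cells in the middle of the toy trajectory fall into the "intermediate" class, which corresponds to the stage that has lost the repressive mark without yet expressing the gene.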
The next steps aim to finalise the analysis of transient chromatin states during gene activation and silencing. Because the gastruloid protocol is performed in vitro, we aim to use GSK126, a well-established inhibitor of the H3K27 methyltransferase Ezh2, to test the consequences of transiently removing this pathway during gene silencing and activation.
The knowledge generated in this project should not only allow a better understanding of the role of H3K27me3 in gene regulation, but can also serve as a blueprint for the systematic analysis of the role of any DNA-binding protein in gene regulation and differentiation. The common transcript component, which is unaffected by the chromatin target chosen, can additionally serve as an intermediate for data alignment and comparison. The method is thus a substantial expansion of experimental possibilities, which we try to make accessible to as many researchers as possible through an in-house facility as well as collaborations with industry.
The ability to detect as much as possible of the underlying biology is especially crucial in diagnostics, where the number of cells obtainable is limited and the knowledge gained should be as comprehensive as possible to ensure the most accurate diagnosis and the selection of the most favourable treatment.
Fig1 Overview