Periodic Reporting for period 1 - GReCS (Characterizing gene regulation in single cells through integration of scRNA-seq and scATAC-seq data with generic multi-modal prior information)
Reporting period: 2021-11-15 to 2023-11-14
The project had three parts. First, a reference dataset of transcription factor (TF) regulation had to be assembled from an extensive list of published datasets. Second, a method had to be developed to identify relevant connections in a given network by integrating different modalities and cell-type-specific input data. Finally, single-cell RNA and ATAC data were to be used jointly to infer gene regulatory networks (GRNs) in a new data analysis application. Overall, all objectives were fulfilled, apart from minor changes made in response to new scientific developments since the start of the action.
As a new way to prioritise connections in a reference network, an approach based on network propagation has been developed. Heterogeneous, cell-type-specific data are passed through the network and significantly enriched nodes are identified. These could represent active transcription factors that drive gene expression and chromatin accessibility in a given cell type. To broaden the scope of the problem, genomic scores derived from GWAS summary statistics have been tested as another type of input data. Mutations in complex diseases often affect enhancers. Therefore, by mapping scores based on summary statistics onto the network and integrating them with cell-type-specific data, TFs that are likely to be functionally affected by those mutations can be identified in a cell-type-specific manner. This approach has been developed and applied as part of a third, data-analysis-based line of work, described next.
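To make the idea of network propagation more concrete, the following is a minimal sketch, not the project's actual implementation. It assumes a random walk with restart as the propagation scheme and a permutation test on the input scores as the enrichment criterion; the report specifies neither. The function names, the restart value, and the toy network are all hypothetical.

```python
import numpy as np

def propagate(adj, seed_scores, restart=0.5, tol=1e-8, max_iter=1000):
    """Random walk with restart (assumed scheme) on an undirected network."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    W = adj / col_sums                     # column-normalised transition matrix
    p0 = np.asarray(seed_scores, dtype=float)
    p0 = p0 / p0.sum()                     # input scores as a probability vector
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * W @ p + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

def enrichment_pvalues(adj, seed_scores, n_perm=200, restart=0.5, seed=0):
    """Empirical p-values from shuffling input scores across nodes (assumed null)."""
    rng = np.random.default_rng(seed)
    observed = propagate(adj, seed_scores, restart)
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        permuted = rng.permutation(seed_scores)
        exceed += propagate(adj, permuted, restart) >= observed
    return observed, (exceed + 1) / (n_perm + 1)

# Hypothetical toy network: TF_A regulates g1 and g2, TF_B regulates g3 and g4.
nodes = ["TF_A", "TF_B", "g1", "g2", "g3", "g4"]
adj = np.zeros((6, 6))
for i, j in [(0, 2), (0, 3), (1, 4), (1, 5)]:
    adj[i, j] = adj[j, i] = 1.0

# Hypothetical cell-type-specific input: only g1 and g2 carry signal.
seed_scores = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0])

scores, pvals = enrichment_pvalues(adj, seed_scores)
for name, s, p in zip(nodes, scores, pvals):
    print(f"{name}\tscore={s:.3f}\tp={p:.3f}")
```

In this toy case the signal placed on g1 and g2 accumulates on TF_A, while TF_B, which sits in a separate component, receives none. GWAS-derived scores could in principle enter the same scheme as an additional input vector, although how the two inputs are combined in the project is not described in this report.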
A new dataset of first-trimester human skeletal development has been extensively analysed with a focus on gene regulation along cell differentiation trajectories and across anatomical locations. The dataset consists of more than 300k cell nuclei profiled with both RNA and ATAC sequencing and spans time points between 5 and 11 post-conception weeks (pcw) across 5 locations. Leveraging both modalities, enhancer-GRNs have been predicted for developmental trajectories including osteogenesis and chondrogenesis. Changes in TF activity were analysed and the effects of TF perturbations were predicted, in particular for mutations known to cause craniosynostosis, a genetic condition in which the bone plates of the skull fuse prematurely. In addition, GWAS summary statistics for hip osteoarthritis have been integrated with single-cell data in a newly developed approach. Interestingly, TFs involved in bone formation were found to be enriched across osteogenic cell types, pointing to a role in hip shape formation, which, when altered, may lead to disease later in life.
The results generated in this project were planned to be published in one or more scientific papers. Changes to the project plan due to new scientific developments and an increased scope of the work delayed the dissemination of the results. However, a manuscript covering the application of GRN inference to skeletal development is currently under revision, as is a second manuscript combining GWAS summary statistics with single-cell data. Finally, a manuscript about CellRegulon DB, a database of cell-type-specific reference GRNs, is going to be submitted soon. The project results have been, and will continue to be, disseminated at scientific meetings and through planned outreach activities.
Gene regulatory network analysis using RNA and ATAC data has been applied extensively to a dataset of early human skeletal development that will be part of the Human Cell Atlas (HCA), and has led to various new results. Single-nucleus profiling and the analysis of a large number of samples of the developing skeleton allowed us to describe differentiation trajectories of bone cells, a task that is difficult because of the matrix-rich environment of those cells. Further, first-trimester human skeletal development of the calvaria has been analysed at single-cell resolution for the first time. Enhancer-GRNs have been predicted for numerous cell types throughout the atlas, making it a multimodal reference that researchers can use for further exploration. In addition, several new cell states have been characterised together with their driving TFs, including the role of different TFs in diseases such as craniosynostosis and in bone formation processes that may lead to osteoarthritis. A new computational approach proposed in this context can be used to identify TFs that play cell-type-specific roles in traits for which GWAS summary statistics exist.
Overall, this work extends the Human Cell Atlas in part and will in part be included in it. The Human Cell Atlas is expected to transform our understanding of biology and to have a wide-reaching impact on future healthcare. The open availability of cell-type-resolved reference maps will aid researchers worldwide in their mission to advance basic science, and will help companies translate research into new drugs and medical innovations.