Periodic Reporting for period 4 - SystGeneEdit (Dissecting quantitative traits and their underlying genetic interactions via systematic genome editing)
Reporting period: 2022-05-01 to 2022-10-31
The success of this project depended on pioneering several novel technologies. CRISPR editing with donor DNA is naturally inefficient in all organisms. To overcome this challenge, we developed new methods to enhance the efficiency of repair by recruiting donor DNA to the cut sites. To investigate the efficiency and fidelity of this system across the genome, we developed new whole-genome sequencing methods and analysis workflows. While most of the genome was edited precisely and efficiently with donor recruitment, some genomic regions were still prone to alternative, undesired edit outcomes. We developed computational models to predict the likelihood of these aberrant editing events across the genome based on sequence features. In addition, we developed methods to characterize more complex phenotypes, including a novel microscopy method that captures and analyzes over 10,000 microscopic images per second, and therefore scales with the complexity of natural and disease-associated genetic variation. Together, these technical achievements enabled investigating multi-modal effects of genetic variants across the entire genome.
Conclusion of the Action: The biological insights and technologies developed during this project have implications for eukaryotes far beyond the yeast model system. For example, our findings that multiple (not single) causal variants underpin many traits have important implications for genetic mapping studies in humans, where it is assumed that the majority of the detected genomic regions harbor a single causative variant. We also found that variants in the yeast system are active only under specific environmental conditions; translated to the human system, this means that the lifestyle of a person could be modified based on their genetic background to avoid a disease. Ultimately, large-scale unbiased detection of functional SNPs will provide important datasets to train and validate computational approaches to predict the impact of an individual’s genetic makeup on health and disease. Our computational models for difficult-to-edit regions should be applicable across organisms and help researchers improve the accuracy and safety of genome editing. Finally, our new gene editing and phenotypic screening technologies give researchers powerful new tools to explore how genetic variation impacts diverse cellular functions.
2. We developed methods to investigate the efficiency and fidelity of precision editing, including a fast, simple, cheap, and scalable method that produces sequencing-ready libraries directly from yeast cultures (Vonesch et al. 2021). We performed genome sequencing of thousands of edited strains and confirmed that the vast majority of clones carried the desired variants without off-target effects. We found that the likelihood of erroneous on-target structural variants was dependent on genome context, and constructed machine-learning models to predict these “hard-to-edit” regions, facilitating the development of improved editing methods (Li et al., in preparation).
3. We constructed variant pools and isolated thousands of strains, each carrying a different variant. We optimized phenotyping and analysis pipelines to maximize sensitivity for subtle effects. We profiled fitness across chemical, drug, and nutritional perturbations to investigate which variants are active under which environmental conditions. We found that most genes carrying such variants had not been linked to growth in these conditions in previous knockout (KO) screens. Their protein products show physical interactions with those of other genes implicated in these traits (Vonesch et al., in preparation). Many natural variants also show genetic interactions with genes implicated in the same trait by the KO screens. We used an improved version of MAGESTIC to dissect quantitative trait loci (QTL) down to their causative nucleotide variants (Roy et al., in preparation).
4. Many SNPs have an effect only in the presence of another SNP. To enable systematic discovery of such genetic interactions, we developed a new method that allows single cells to acquire multiple edits. A key achievement was the development of a barcoding system capable of linking the barcodes from multiple rounds of editing, allowing the edit combinations to be read out by short-read sequencing (Roy et al., in preparation).
5. We introduced several novel methods that allow characterization of thousands of CRISPR perturbation effects within a single experiment. One of these methods, termed image-enabled cell sorting (Schraivogel et al. 2022), allows isolation of cells according to information from microscopic images at speeds of up to 15,000 cells per second. This method will enable new types of experimental strategies and is compatible across organisms, from yeast to mammalian cells. The other method is a targeted single-cell RNA-seq assay for the massively parallel molecular phenotyping of cells carrying genetic perturbations. We obtained rich molecular phenotyping information on the expression of ~200 genes across ~1,000 genetic perturbations (Schraivogel, Gschwind et al. 2020), and are now using it to assign functions to disease-associated genetic variants (ongoing work).
- MAGESTIC 2.0 enabled us to generate high-quality natural variant libraries in which >95% of strains contain the correct changes in genome sequence.
- We phenotyped libraries to assign functions to natural variants and dissected potential interactions between variants.
- We have developed a simplified whole-genome sequencing (WGS) library preparation workflow that skips the traditional genomic DNA isolation step.
- By sequence-validating thousands of edited strains, we identified and can now predict hard-to-edit regions of the genome.
- We implemented assays to query the effect of thousands of CRISPR perturbations on gene expression or microscopic phenotypes in single cells.