CORDIS - EU research results
Content archived on 2024-05-28

Genetic mapping of complex trait intermediates

Final Report Summary - INTGENMAP (Genetic mapping of complex trait intermediates)

Many important traits are heritable, and have a strong genetic component. In simple cases, such as Mendelian diseases, the genetic cause can be found with linkage methods, and many trait genes have been mapped to date. More recently, association mapping studies have focused on complex traits that include prevalent human diseases, such as type 2 diabetes, hypertension, and others. Numerous genome-wide association studies have corroborated that no single gene explains all or even a large part of the heritable variability in such traits, and that individual effect sizes due to common variants are small (Manolio et al., 2009). The effect of a single locus genotype on a global trait has to be mediated by cellular, tissue, and organ phenotypes. Thus, genetics of cellular traits is central to developing an understanding of the genetic basis of complex traits. Studies in model organisms, where causality can be addressed by reverse genetic tools, are required for understanding the mechanisms behind such traits in order to target them with pharmaceuticals. This project aims to map the genetic basis of intermediate cellular traits (mRNA levels, protein levels, protein localisation, metabolite levels) that influence fitness in the model organism Saccharomyces cerevisiae.

First, we systematically analyzed levels of 4,084 GFP-tagged yeast proteins in the progeny of a cross between a laboratory and a wild strain at single-cell resolution using flow cytometry and high-content microscopy to understand the genetic basis of protein levels, the most important intermediate trait. The genotype of trans variants contributed little to protein level variation between individual cells, but explained over 50% of the variance in the population average protein abundance for half of the GFP-fusions tested. To map trans-acting factors responsible for the heritable expression variation, we performed flow sorting and bulk segregant analysis of twenty-five proteins, finding a median of five protein quantitative trait loci (pQTLs) per GFP-fusion. In our mapping analysis, we find that cis-acting variants predominate; the genotype of a gene and its surrounding region had a large effect on protein level six times more frequently than the rest of the genome combined. We further found evidence for both shared and independent genetic control of transcript and protein abundance: over half of the expression QTLs (eQTLs) contribute to changes in protein levels of regulated genes, but several pQTLs do not affect their cognate transcript levels. Allele replacements of genes known to underlie trans eQTL hotspots confirmed correlation of effects on mRNA and protein levels. These results represent the first genome-scale measurement of genetic contribution to protein levels in single cells and populations, identify over a hundred trans pQTLs, and validate the propagation of effects associated with transcript variation to protein abundance.
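The "variance explained" figure quoted above can be made concrete with a small sketch. It assumes a single biallelic marker and a list of population-average abundances for one GFP fusion across segregants, and computes the between-genotype share of the total sum of squares (a one-way ANOVA R²). The function name, the genotype labels, and the toy numbers are illustrative assumptions; this is not the analysis pipeline used in the project.

```python
from statistics import mean


def variance_explained(abundance, genotype):
    """Fraction of variance in `abundance` explained by one marker.

    abundance: population-average protein level per segregant (hypothetical data).
    genotype:  parental allele label per segregant, e.g. "lab" or "wild".
    """
    grand_mean = mean(abundance)
    total_ss = sum((x - grand_mean) ** 2 for x in abundance)
    if total_ss == 0:
        return 0.0  # no variation at all, so nothing to explain

    # Group abundances by the allele each segregant inherited.
    groups = {}
    for x, g in zip(abundance, genotype):
        groups.setdefault(g, []).append(x)

    # Between-group sum of squares: how far each allele's mean sits
    # from the grand mean, weighted by group size.
    between_ss = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    return between_ss / total_ss


# Toy example: the marker separates low and high abundance perfectly.
levels = [1.0, 1.0, 3.0, 3.0]
alleles = ["lab", "lab", "wild", "wild"]
r_squared = variance_explained(levels, alleles)
```

A value near 1 means the marker genotype accounts for almost all of the variation in average abundance, while a value near 0 means the two allele groups have similar means.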

In collaboration with the Stanford Genome Technology Center, we established assays for medium-throughput perturbation of gene activity using the CRISPR/Cas9 system. We designed libraries of genetic perturbations that target known complex trait genes, as well as broad sets of functionally related genes, such as transcription factors, kinases, and drug pumps. We then aimed to assess the rules for designing effective guide RNAs (gRNAs) for the CRISPR/Cas9 system in yeast. To do so, we created an inducible single-plasmid CRISPR interference system for gene repression, and used it to analyze fitness effects of gRNAs under 18 small molecule treatments. The idea is that effective repression of a drug resistance gene will reduce the growth rate in the presence of the drug. By testing many repression constructs in parallel in a single experiment, and assessing their growth rates by quantitative barcode sequencing, we could rapidly identify the repressors that work well. Indeed, we correctly identified previously described chemical-genetic interactions as positive controls, as well as a new mechanism of suppressing fluconazole toxicity by repression of the ERG25 gene. Further, we established guidelines for future guide RNA designs. gRNAs that target regions with low nucleosome occupancy and high chromatin accessibility were more effective. We also found that the best region to target with gRNAs was between the transcription start site (TSS) and 200 bp upstream of the TSS. Finally, unlike with nuclease-proficient Cas9 in human cells, point mutations were tolerated equally well by truncated (18 nt specificity sequence) and full-length (20 nt) gRNAs; however, 18 nt gRNAs generally exhibited less potent effects than full-length gRNAs. These results establish a powerful functional genomics screening method, provide rules for designing effective gRNAs for gene repression, and show that 18 nt and 20 nt gRNAs exhibit similar tolerance to mismatches in the target sequence.
The findings will enable effective library design and genome-wide screening in many genetic backgrounds. Future work by the working groups in Europe will apply these insights to human cells to identify perturbations that can readily change disease phenotypes at the cellular level.
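The pooled barcode readout described above can be sketched as follows. The example assumes each gRNA construct carries a unique barcode, and that read counts are available for the same pool grown with and without drug; it then scores each construct by the log2 change in its relative barcode frequency. The function name, the construct names, the pseudocount, and the counts are all hypothetical, and the scoring scheme is a common simplification rather than the method used in the project.

```python
import math


def log2_fold_changes(control_counts, drug_counts, pseudocount=1.0):
    """Log2 change in relative barcode frequency, drug versus control.

    control_counts, drug_counts: dicts mapping construct name -> read count.
    A pseudocount (an assumption of this sketch) avoids division by zero
    for barcodes that drop out entirely.
    """
    # Only score constructs observed in both conditions' count tables.
    constructs = control_counts.keys() & drug_counts.keys()
    control_total = sum(control_counts[c] + pseudocount for c in constructs)
    drug_total = sum(drug_counts[c] + pseudocount for c in constructs)

    scores = {}
    for c in constructs:
        control_freq = (control_counts[c] + pseudocount) / control_total
        drug_freq = (drug_counts[c] + pseudocount) / drug_total
        scores[c] = math.log2(drug_freq / control_freq)
    return scores


# Toy pool of two constructs (hypothetical names and counts).
control = {"gRNA_A": 99, "gRNA_B": 99}
drug = {"gRNA_A": 49, "gRNA_B": 149}
scores = log2_fold_changes(control, drug)
```

In this toy pool, `gRNA_A` receives a negative score because its barcode is depleted under drug, which is the pattern expected when a construct effectively represses a gene needed for drug resistance.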

The Fellow continues his career as Faculty at the Wellcome Trust Sanger Institute in the UK. He will pursue these directions in human cells to determine whether it is feasible to directly change a complex trait (e.g. diabetes or Alzheimer’s susceptibility) using simple perturbations, whether it is more fruitful to target individual causal loci or intermediate pathways, and how best to select candidates for targeting with small molecules. The completed project has informed how to focus these efforts to help relieve the complex disease burden in Europe.