Community Research and Development Information Service - CORDIS


Aggregation selection Report Summary

Project ID: 653963
Funded under: H2020-EU.1.3.2.

Periodic Reporting for period 1 - Aggregation selection (Genome-wide screen of aggregation selection)

Reporting period: 2015-08-01 to 2017-07-31

Summary of the context and overall objectives of the project

In order to carry out their biological function, proteins must fold into a unique native state. However, the failure of a polypeptide to acquire or maintain this native state can result in protein aggregation, caused by the interaction of solvent-exposed hydrophobic stretches. The study of this event has grown into a dynamic scientific field, mainly due to its association with numerous human diseases such as Alzheimer’s and Parkinson’s disease. Moreover, aggregation is one of the most critical problems in recombinant protein expression, both in experimental investigations and in large-scale protein production. Beyond its role in pathogenicity, however, there are numerous examples of cells exploiting protein aggregation for crucial functional purposes, such as scaffolding melanin in the skin or storing hormones intracellularly.

To investigate the molecular events underlying the intracellular aggregation process and aggregation-driven toxicity, several model systems have been developed. Simple unicellular organisms, such as bacteria or yeast, are used as models for studying protein deposition inside the cell, as they have simple growth requirements and an extensive background knowledge base (e.g. well-annotated genome sequences). Genome-wide screens in these organisms have identified processes that maintain proteome stability and promote folding and clearance (e.g. chaperones, proteasome subunits, stress-induced transcriptional regulators).

This project aims to quantitatively identify modifiers of the effect of protein aggregation on cell fitness by employing a genetic screening strategy in S. cerevisiae (Figure 1). Additionally, our model can distinguish and quantify how much of the fitness effect is driven by foci formation and how much by specific loss- or gain-of-function.

An understanding of these determinants is essential in order to i) unravel how the cell regulates the aggregation process; ii) develop new strategies for tackling the debilitating pathologies associated with protein aggregation; and iii) increase the solubility and functionality of recombinant proteins.


- We identified a number of genes that are able to modulate the aggregation process; these genes differ significantly depending on whether the aggregated protein is essential, non-essential or toxic.
- The methodology of the screen has been validated, as some of the identified genes, or genes of a similar functional class, have been reported in previous studies.
- Using a systems-biology approach, we could not identify a significant difference in either a) ontology or b) pathway enrichment between the different conditions tested.

Work performed from the beginning of the project to the end of the period covered by the report and main results achieved so far

Creation of strains:
We used the system introduced in De Groot et al., 2017. To summarize, we used two vectors (pMA vector) containing i) URA3 and GFP (URAsol) or ii) URA3, GFP and Abeta42 (URAagg) under control of the inducible GAL1 promoter. This fragment is flanked by 65 bp regions homologous to TRP1, allowing integration into the Y7039 strain (MATα can1Δ::STE2pr-LEU2 lyp1Δ ura3Δ0 leu2Δ0 his3Δ1 met15Δ0).

Creation of the knock-out libraries:
We used the BY4741 KO library, which comprises around 5200 KO strains. Every strain is characterized by two 20-nucleotide tags (uptag and downtag) that are used to determine which strains have a growth advantage over other strains in the pool.
Using a RoToR robot (Singer Instruments) we will perform a synthetic genetic array to cross the genome-wide deletion library with the two strains described above to obtain two new collections of yeast cells expressing either URAsol or URAagg.

In all the assays, the strains were grown in SD -HIS media containing a mixture of sugars and amino acids. Based on De Groot et al., 2017, we used 1% galactose for induction.
Single colonies were picked and grown overnight in 2% Glucose. This culture was used to inoculate SD -HIS 2% Raffinose and grown for six hours. It was then inoculated into fresh media with 2% Raffinose and 1% Galactose.
SD -HIS -Ura was employed to test the essentiality of Ura3p. SD -HIS with uracil and 5FOA (Zymo Research) was used to analyse the aggregation effects of a toxic Ura3p activity.

Images were acquired with a Zeiss710 (Carl Zeiss) using a 63x objective, an excitation laser of 488 nm and an emission window between 581 nm and 750 nm.
We confirmed that URAsol remains soluble, whereas URAagg accumulates into intracellular foci.

Competition Experiment:
The competition experiment will consist of two cultures containing two different combinations of the above-mentioned strains: (i) one will include the strains expressing URAsol and (ii) the other one the strains expressing URAagg.
These cultures will be grown for three days (approximately 25 generations) at exponential phase in a biofermentor. I will take samples at four different time points (0h, 24h, 48h and 72h), from which I will extract the genomic DNA using a LiOAc-SDS buffer.

Sequencing samples:
Using primers that bind to the common regions of all gene disruptions (U1 and D1), we PCR amplify the molecular barcodes, allowing us to distinguish the different strains within each pool. Moreover, an additional set of 8-nucleotide multiplex indices is added within the primers to distinguish the samples (i.e. different timepoint, different media (minURA, plusURA, 5FOA), different construct (SOL vs AGG)).
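
As an illustration of how a read could be assigned to a sample and a strain from its multiplex index and barcode, the following is a minimal sketch and not the project's actual code. The index-to-sample table, the common primer sequence, the read layout (index, then common region, then barcode) and the function names are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical values for illustration only; not the project's real sequences.
SAMPLE_INDICES = {
    "ACGTACGT": "SOL_minURA_0h",
    "TGCATGCA": "AGG_minURA_0h",
}
COMMON_PRIMER = "CCCCGGGGAAAATTTT"  # made-up stand-in for a common region such as U1
INDEX_LEN = 8      # 8-nucleotide multiplex index, as described in the text
BARCODE_LEN = 20   # 20-nucleotide strain tag, as described in the text


def demultiplex(read):
    """Return (sample, barcode) for a read, or None if it cannot be assigned.

    Assumed layout: [8-nt index][common primer][20-nt barcode][anything].
    """
    index, rest = read[:INDEX_LEN], read[INDEX_LEN:]
    sample = SAMPLE_INDICES.get(index)
    if sample is None or not rest.startswith(COMMON_PRIMER):
        return None
    start = len(COMMON_PRIMER)
    barcode = rest[start:start + BARCODE_LEN]
    if len(barcode) < BARCODE_LEN:
        return None  # read too short to contain a full barcode
    return sample, barcode


def count_barcodes(reads):
    """Count reads per (sample, barcode) pair, skipping unassignable reads."""
    counts = Counter()
    for read in reads:
        hit = demultiplex(read)
        if hit is not None:
            counts[hit] += 1
    return counts
```

This sketch uses exact matching only; a real analysis would normally tolerate sequencing errors in the index and barcode.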

Computation Pipeline:
After sequencing the above described samples, we developed a computational pipeline to:

- Trim the reads based on quality and remove adaptors
- Trim 5' and 3' multiplex indices and assign to the sample using cutadapt
- Map to KO library using vsearch
- Build the counts table
- Merge the experimental metadata (media, timepoint, construct)
- Normalize the counts by the starting counts to relate the values within a replicate to each other
- Perform robust regression to obtain growth estimates
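
The last two steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline's implementation: the report does not say which robust regression method was used, so the Theil-Sen estimator (median of pairwise slopes) is used here as one common robust choice. The function names and the pseudocount are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def log2_relative_abundance(counts_by_time, pseudocount=1.0):
    """Normalize each count by the count at the first time point (log2 scale).

    counts_by_time maps a time point (e.g. hours) to a read count for one strain.
    The pseudocount (an assumption of this sketch) avoids taking log2 of zero.
    """
    times = sorted(counts_by_time)
    start = counts_by_time[times[0]] + pseudocount
    return [
        (t, math.log2((counts_by_time[t] + pseudocount) / start))
        for t in times
    ]


def theil_sen_slope(points):
    """Theil-Sen estimate: the median of the slopes between all pairs of points."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if x2 != x1
    ]
    return median(slopes)


# Example with made-up counts for one strain sampled at 0h, 24h, 48h and 72h.
example_counts = {0: 100, 24: 200, 48: 400, 72: 800}
points = log2_relative_abundance(example_counts, pseudocount=0.0)
growth_estimate = theil_sen_slope(points)  # log2 fold change per hour
```

The slope is a relative growth estimate for one strain in one condition. Comparing these estimates between the URAsol and URAagg pools is what would point to candidate modifiers.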

This pipeline has been built using Common Workflow Language (CWL) and Docker (for containerization), promoting reliable and reproducible research.
Using this computational pipeline, we could identify a number of genes that are able to modulate the aggregation process.

Functional Enrichment Analysis:
For the functional enrichment analysis, we used GO and KEGG data. For the statistical tests, we made use of the Fisher and Wilcoxon tests. We could not identify a significant difference between the conditions tested.
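
To show how a Fisher test can be applied to term enrichment, here is a minimal sketch of a one-sided Fisher's exact test computed from the hypergeometric distribution. The function name, argument names and example numbers are assumptions for illustration. The report does not specify how the test was set up, and no multiple-testing correction is shown.

```python
from math import comb


def fisher_enrichment_p(hits_in_term, hits_total, term_size, background_size):
    """One-sided Fisher's exact test for over-representation.

    Returns P(X >= hits_in_term), where X is the number of genes annotated to a
    term when hits_total genes are drawn without replacement from a background
    of background_size genes, term_size of which carry the annotation.
    """
    upper = min(hits_total, term_size)
    denom = comb(background_size, hits_total)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, hits_total - k)
        for k in range(hits_in_term, upper + 1)
    )
    return tail / denom


# Made-up example: 50 screen hits out of 5000 strains, 8 of them in a
# hypothetical 100-gene term.
p_value = fisher_enrichment_p(8, 50, 100, 5000)
```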

Exploitation and dissemination:

- The developed pipeline will be available on GitHub and can help other researchers with their analysis of amplicon data.

- The obtained data and the results from this study will be published.

Progress beyond the state of the art and expected potential impact (including the socio-economic impact and the wider societal implications of the project so far)

We identified regulators that affect the impact of protein aggregation on cell fitness. This screen can therefore assist in the design of yeast strains for improved protein expression, where protein aggregation remains an important bottleneck. Moreover, the identified regulators may prove relevant for biomedicine in the treatment of protein-misfolding diseases, as they can assist in the search for proteostatic regulators useful for controlling neurodegenerative diseases.
