Periodic Reporting for period 3 - IdrSeq (Discovery and characterization of functional disordered regions and the genes involved in their regulation through next generation sequencing)
Reporting period: 2019-05-01 to 2020-10-31
The overall objective of this project is to identify and characterize functional intrinsically disordered regions (IDRs) in cells, and to discover genes involved in their regulation, using yeast as a cellular model. We proposed to develop and apply a targeted, high-throughput, multiplexed approach that we call IdrSeq (for Intrinsically disordered region Sequencing). This approach has now been developed and published in an open access journal (Ravarani et al, MSB 2018). Specifically, using IdrSeq, we aim to discover and characterize IDRs that can
(Aim 1) function in transcriptional activation, and discover genes that modulate transcriptional activity;
(Aim 2) influence protein stability, and discover genes involved in regulating half-life; and
(Aim 3) form higher-order assemblies, and discover genes that regulate assembly formation.
The unique feature of this proposal is its integrative vision of synthetic & systems biology, (un)structural biology, cell biology, genetics, experiments and computation to establish a discovery platform to study IDRs in a cellular context. Since IdrSeq is modular and scalable, it can be readily extended to investigate a broad range of IDR functions, and adapted to other organisms. Elucidating the principles of sequence-function-gene relationship of IDRs holds enormous potential for synthetic biology. The discovery of genes that regulate IDR function has direct implications for human health by revealing novel therapeutic targets.
Since the beginning of the project, we have developed and presented IDR-Screen, a framework to discover functional IDRs in a high-throughput manner by simultaneously assaying large numbers of DNA sequences that code for short disordered sequences. Patterns in the protein sequence that confer functionality are then inferred through statistical learning. Using a yeast HSF1 transcription factor-based assay, we have discovered IDRs that function as transactivation domains (TADs) by screening a random sequence library and a designed library consisting of variants of 13 diverse TADs. Using machine learning, we have discovered that segments devoid of positively charged residues but with redundant short sequence patterns of negatively charged and aromatic residues are a generic feature of TAD functionality. We could use this rule to design new sequences with increased strength of transactivation. We also used this approach to assess the impact of variants in human transactivation domains that are observed in the natural population and in cancer genomes. We anticipate that investigating defined sequence libraries for specific functions using IDR-Screen can facilitate the discovery of novel functional regions of the disordered proteome and help to understand the impact of natural and disease variants in disordered segments.
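To make the qualitative rule above concrete, the following minimal Python sketch counts the features it mentions for a single peptide. It is an illustration only and not the statistical model used in the published work: the residue classes (K and R as positive, D and E as negative, F, W and Y as aromatic), the function names, the `window` and `min_patterns` parameters and the toy peptides are all assumptions introduced here.

```python
# Illustrative sketch only: a hand-written check of the qualitative rule
# described in the text, NOT the machine-learning model from the publication.
# Residue classes and thresholds below are assumptions made for this example.
POSITIVE = set("KR")   # assumption: histidine is not counted as positive
NEGATIVE = set("DE")
AROMATIC = set("FWY")


def tad_rule_features(peptide, window=3):
    """Count residue classes and acidic-aromatic patterns in a peptide.

    A "pattern" is counted for each aromatic residue that has at least one
    negatively charged residue within `window` positions on either side.
    """
    seq = peptide.upper()
    n_patterns = 0
    for i, aa in enumerate(seq):
        if aa in AROMATIC:
            start = max(0, i - window)
            neighbours = seq[start:i] + seq[i + 1:i + 1 + window]
            if any(x in NEGATIVE for x in neighbours):
                n_patterns += 1
    return {
        "positive": sum(aa in POSITIVE for aa in seq),
        "negative": sum(aa in NEGATIVE for aa in seq),
        "aromatic": sum(aa in AROMATIC for aa in seq),
        "acidic_aromatic_patterns": n_patterns,
    }


def matches_tad_rule(peptide, min_patterns=2, window=3):
    """Return True if the peptide has no positive residues and contains
    at least `min_patterns` acidic-aromatic patterns (hypothetical cutoff)."""
    features = tad_rule_features(peptide, window)
    return (features["positive"] == 0
            and features["acidic_aromatic_patterns"] >= min_patterns)


if __name__ == "__main__":
    # Made-up toy peptides, not sequences from the screen.
    for toy in ("DDFDEWEDLF", "DDFKEWEDLF", "GSGSGSGS"):
        print(toy, tad_rule_features(toy), matches_tad_rule(toy))
```

A check like this captures only the coarse features named in the text; relating sequence to transactivation strength would require a model trained on the screening data.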
The work and the dataset have been published as an Article in the open access journal Molecular Systems Biology (Ravarani et al, MSB 2018). The work was featured on the cover and accompanied by a News and Views article highlighting its importance. The paper was also rated as Exceptional by F1000.
News and Views: http://msb.embopress.org/content/14/5/e8377
Aim 2: Identify and characterize IDRs that influence protein stability, and discover genes that regulate IDR mediated protein stability.
In the last couple of years, our team has assayed a viral proteome using the IDR-Screen approach and has discovered regions that act as strong degrons. More importantly, we are now performing follow-up screens to discover the ubiquitin ligases that regulate degron activity. The work describing this project will be written up for publication next year.
Aim 3: Identify IDRs that can form higher-order assemblies/aggregates and discover genes/conditions involved in regulating assembly formation.
In the last year, our team has developed assays for screening peptides that form aggregates and has assayed a viral proteome using the IDR-Screen approach to discover regions that can form higher-order assemblies. We are exploring different sequences to be used as libraries for discovering further regions that form higher-order assemblies. However, the grant had to be terminated as the PI moved to the USA.
Thanks to the generous funding through this ERC Consolidator Grant, we have now developed a targeted, high-throughput, multiplexed technology called IDR-Screen. The unique aspect of this approach is its integration of synthetic & systems biology, (un)structural biology, cell biology, genetics, experiments and computation to establish a discovery platform to identify and characterise functional disordered regions directly in a cellular context. Given the emerging importance of IDRs and a newfound understanding of their biomedical relevance, the approach we have developed in this project can be, and is being, readily extended to investigate a broad range of functions of IDRs in a cellular context. In the first reporting period, we had already generated high-resolution data on the sequence-function relationship of IDRs that can function as transactivation domains. These data are fuelling the development of methods for investigating protein function and for interpreting the genome sequence of disordered regions.
The grant had to be terminated on 30 June 2020 because the PI moved to St Jude Children's Research Hospital in the USA as an Endowed Chair in Biological Data Science and as Director of the Center for Data Driven Discovery.