Periodic Reporting for period 1 - DaVinci-Switches (Designing Allosteric Protein Switches by In Vivo Directed Evolution and Computational Inference)
Reporting period: 2022-09-01 to 2025-02-28
chemicals is a central goal in synthetic biology. The design of switchable proteins, in particular single-chain, allosteric variants, however,
is a challenging engineering problem thus far mostly addressed by trial and error. DaVinci-Switches takes a radically new, data-driven
perspective to fundamentally advance our understanding of protein allostery and accelerate and eventually rationalize the engineering of
switchable proteins by interfacing synthetic biology with machine learning. We will establish a 'design by directed evolution' approach
to create switchable proteins through receptor and effector fusion followed by phage-assisted in vivo directed evolution using synthetic
gene circuits for selection. We will apply this novel pipeline to a diverse set of effector proteins and monitor the evolutionary process by
next-generation sequencing (Objective 1). In parallel, we will perform an in-depth computational analysis of domain insertions within the
natural protein repertoire. The combined, rich datasets will be used to train machine learning models to infer sequence patterns predictive
of domain insertion tolerance and allosteric coupling between receptor-effector pairs (Objective 2). Finally, we will employ this unique
model to design light- and drug-inducible variants of the Yamanaka cell reprogramming factors. These will provide the foundation
of an adeno-associated virus-based platform for cyclic, partial in vivo reprogramming of somatic cells with enormous potential for
regenerative medicine, which will be evaluated in a murine model of drug-induced liver injury (Objective 3). DaVinci-Switches harnesses
our key competences in protein engineering, synthetic biology and computation to reveal fundamental principles of allostery and enable
transformative advances in the design of switchable proteins for research and medicine.
We significantly expanded our phage-assisted evolution capabilities by integrating a retron-based indel mutagenesis method. This allows exploration of insertions and deletions in addition to point mutations, dramatically broadening the accessible fitness landscape.
On the computational side, we developed ProDomino, a machine learning model that predicts viable insertion sites for domain insertion-based allosteric control. Trained on semi-synthetic datasets and refined using experimental data from Objective 1, ProDomino enables accurate inference of insertion points and allosteric sites across diverse protein families (Wolf et al., bioRxiv, 2024; https://www.biorxiv.org/content/10.1101/2024.12.04.626757v1). Early applications of the model, including successful validation in complex effectors such as CRISPR-Cas9 and Cas12a, have demonstrated its versatility and significant potential for rational, scalable protein engineering.
Finally, we initiated preparatory steps for creating switchable mammalian transcription factors. We engineered highly potent light-switchable synthetic transcription factors (Gal4-VP64 and dCas9-VPR) using LOV2 domain fusions, establishing a strong technological basis for Objective 3 (Münch et al., originally posted on bioRxiv and now published in Nucleic Acids Research; https://pubmed.ncbi.nlm.nih.gov/39676667/). These achievements lay the foundation for engineering switchable Yamanaka factors to enable inducible cellular reprogramming, extending the scope of the project toward therapeutic and regenerative applications.
On the computational side, our deep learning model ProDomino emerged as a powerful tool for domain insertion engineering. Trained on semi-synthetic datasets and experimentally fine-tuned, the model predicts viable allosteric insertion sites across diverse protein families with high accuracy. Early validations on complex effectors such as CRISPR-Cas9 and Cas12a, as well as additional proteins, demonstrated its broad applicability. Its potential to standardize and streamline domain insertion engineering marks a key advancement in protein switch design.