Periodic Reporting for period 4 - 3DPROTEINPUZZLES (Shape-directed protein assembly design)
Reporting period: 2022-12-01 to 2024-05-31
Current approaches for protein self-assembly design do not result in assemblies with the structural complexity required to encode many of the sophisticated functions found in nature. Although impressive-looking protein containers have been rationally designed, they have shortcomings such as large pores on the surface and the lack of a mechanism to assemble and disassemble the containers when loading them with molecules. Current methods for de novo design of protein containers have produced a limited number of working systems that can compete with containers derived from nature. There is a need to develop approaches for de novo design of containers that can substantially expand the number of working systems with more diverse structural and self-assembly properties.
In this project, we have developed a new protein design paradigm, shape-directed protein design, to address the shortcomings of current methodologies. The methodology combines 3D geometric shape matching to identify protein building blocks that can form tightly assembled protein containers, an approach to optimize the relative orientation of protein building blocks that honors the complex symmetry of the container system, and sequence design algorithms based on artificial intelligence.
Because assembly design is a complex task with an expected limited success rate, it is important to be able to screen many different constructs cost-effectively. A central goal of this proposal has been to develop new methods for high-throughput screening of protein variants and mutants. This involves developing in vivo fluorescence assays for protein stability and expression that are compatible with self-assembled proteins.
The design pipeline depends on several new computational technologies developed during the project. An efficient library for 3D shape matching with Zernike descriptors was developed. A method for aligning proteins based on surface shape, ZEAL, was developed. An all-atom protein docking method based on differential evolution, EvoDOCK, was developed. EvoDOCK is capable of docking heterodimeric complexes with high accuracy when compatible backbones are available for the binding partners, and it does so with an order of magnitude greater computational efficiency. A symmetric version of EvoDOCK enables the docking of proteins with cubic symmetry, which in turn enables de novo prediction of cubic assemblies starting from a protein sequence. This prediction approach combines EvoDOCK with AlphaFold2 and addresses structures of this kind that AlphaFold2 and AlphaFold3 on their own are not able to predict.
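To make the idea of docking by differential evolution more concrete, the following is a minimal sketch of the classic DE/rand/1/bin scheme applied to a toy rigid-body problem. It is not EvoDOCK's implementation: the 2D point sets, the squared-distance score, and all names and parameter values (`transform`, `score`, `differential_evolution`, `f`, `cr`, population size) are assumptions chosen for illustration, whereas a real docking method would use an all-atom energy function and 3D poses.

```python
import math
import random


def transform(points, pose):
    """Apply a 2D rigid-body pose (theta, tx, ty) to a list of (x, y) points."""
    theta, tx, ty = pose
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def score(pose, ligand, target):
    """Toy stand-in for a docking energy: summed squared distance to the target."""
    posed = transform(ligand, pose)
    return sum((px - qx) ** 2 + (py - qy) ** 2
               for (px, py), (qx, qy) in zip(posed, target))


def differential_evolution(objective, bounds, pop_size=30, f=0.6, cr=0.9,
                           generations=200, seed=0):
    """Minimize `objective` over box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    lo, hi = bounds[d]
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    trial.append(min(max(value, lo), hi))  # clamp to the box
                else:
                    trial.append(pop[i][d])
            trial_fitness = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_fitness <= fitness[i]:
                pop[i], fitness[i] = trial, trial_fitness
    best = min(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best]


# Hypothetical test case: recover a known pose of an asymmetric point set.
ligand = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (-0.5, 1.0)]
true_pose = (0.7, 2.0, -1.5)
target = transform(ligand, true_pose)
bounds = [(-math.pi, math.pi), (-5.0, 5.0), (-5.0, 5.0)]
best_pose, best_score = differential_evolution(
    lambda pose: score(pose, ligand, target), bounds)
```

The population-based search needs no gradients, which is one reason evolutionary schemes of this kind are attractive for rugged scoring landscapes. The score here is smooth and has a single optimum, so it only illustrates the mechanics of mutation, crossover, and selection.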
To enable efficient screening and optimization of designed proteins, a method for screening for expression and stability in vivo was developed. This methodology enables the relative stability of a large number of protein variants to be evaluated in one experiment by expressing a DNA library in E. coli and screening with fluorescence-activated cell sorting. The methodology is based on monitoring the protein quality control system when a protein variant is expressed in the cell. This system also allowed us to investigate how chaperones inside cells respond to folding stress. We have also developed genetic constructs that allow us to monitor the expression of capsid proteins without having to fuse every subunit to a large fluorescent protein. In combination with the stability and expression screening methodology, these constructs enable rapid testing of many designs, as well as directed evolution of protein capsid designs.
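As a rough illustration of how a pooled sorting experiment can yield a relative ranking of variants, the sketch below assumes a common sort-and-sequence style of analysis: cells are sorted into fluorescence bins, each bin is sequenced, and every variant receives a frequency-weighted mean bin index. The report does not describe its analysis at this level, so the bin layout, the read counts, and the function names (`bin_frequencies`, `mean_bin_score`) are hypothetical.

```python
def bin_frequencies(counts):
    """Normalize read counts within each bin so bins of different depth compare."""
    n_bins = len(next(iter(counts.values())))
    totals = [sum(v[b] for v in counts.values()) for b in range(n_bins)]
    return {
        name: [v[b] / totals[b] if totals[b] else 0.0 for b in range(n_bins)]
        for name, v in counts.items()
    }


def mean_bin_score(freqs):
    """Frequency-weighted mean bin index per variant (None if never observed)."""
    scores = {}
    for name, f in freqs.items():
        total = sum(f)
        if total:
            scores[name] = sum(i * x for i, x in enumerate(f)) / total
        else:
            scores[name] = None
    return scores


# Hypothetical read counts per variant across four fluorescence bins,
# ordered from lowest (index 0) to highest (index 3) fluorescence.
counts = {
    "variant_A": [90, 10, 0, 0],
    "variant_B": [0, 10, 40, 50],
    "variant_C": [25, 25, 25, 25],
}
scores = mean_bin_score(bin_frequencies(counts))
ranked = sorted(scores, key=scores.get)  # lowest to highest mean bin
```

Whether a high or a low score indicates a more stable variant depends on how the reporter is coupled to the quality control response, which is not specified here. The sketch only shows how a single sorting experiment can place many variants on a common relative scale.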
The experimental and computational methodology developed so far in this project can be of great value to researchers outside this project, and find applications in the industrial setting as well.
The ability to screen a large number of variant proteins in a fast and cost-effective manner is central to tackling complex protein design challenges. We have developed a high-throughput assay in which millions of proteins can be screened for stability and protein expression in a single experiment. This will be broadly applicable in many areas of protein engineering and we have also shown that we can use the system to study the protein quality control system in bacteria. We have also introduced a fully automated setup for cloning, expression, and purification of design variants for in vitro studies that can readily be used to test hundreds of constructs without much manual labor. The project has also resulted in a set of unique tools for protein design, structural analysis, and structure prediction of large assemblies.