CORDIS - EU research results

Shape-directed protein assembly design

Periodic Reporting for period 4 - 3DPROTEINPUZZLES (Shape-directed protein assembly design)

Reporting period: 2022-12-01 to 2024-05-31

Large protein complexes carry out some of the most complex functions in biology. Such structures are often assembled spontaneously from individual components through self-assembly. Engineering self-assembled protein complexes can enable a wide range of applications in biomedicine, nanotechnology, and materials science. These include targeted delivery of protein drugs into cells and into specific cellular compartments, nanoreactors for efficient enzymatic synthesis of molecules, and synthesis of highly uniform nanoparticles. Advances in bottom-up design of protein assemblies can thus unlock new approaches for medical treatment and ecologically sustainable production of chemicals.

Current approaches for protein self-assembly design do not result in assemblies with the structural complexity required to encode many of the sophisticated functions found in nature. Although impressive-looking protein containers have been rationally designed, they have shortcomings, such as large pores on the surface and the lack of a mechanism to assemble and disassemble the containers when loading them with molecules. Current methods for de novo design of protein containers have produced only a limited number of working systems that can compete with containers derived from nature. There is a need for de novo design approaches that can substantially expand the number of working systems and diversify their structural and self-assembly properties.

In this project, we have developed a new protein design paradigm, shape-directed protein design, to address the shortcomings of current methodologies. The methodology combines 3D geometric shape matching to identify protein building blocks that can form tightly assembled protein containers, an approach to optimize the relative orientation of protein building blocks that honors the complex symmetry of the container system, and sequence design algorithms based on artificial intelligence.

Because assembly design is a complex task with an expected limited success rate, it is important to be able to screen many different constructs and to do so cost-effectively. A central goal of this proposal has been to develop new methods for high-throughput screening of protein variants and mutants. This involves developing in vivo fluorescence assays for protein stability and expression that are compatible with self-assembled proteins.
The outcome of the project has been a computational pipeline for the design of symmetric protein assemblies with cubic symmetry. Based on the surface shape of a compatible building block, which can be extracted from natural proteins that form protein containers or simulated de novo, natural protein domains with similar shapes are identified. After the position of the subunit in the symmetric assembly has been optimized for high shape complementarity, the sequence of surface residues is designed to optimize the interface interactions that stabilize the assembly. The methodology results in protein capsid models with exquisite shape complementarity that form closed shells with small pores, with a variation in capsid and pore sizes that goes far beyond the current state of the art.
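
The step of identifying domains with similar shapes can be illustrated with a minimal sketch. It is not the project's shape-matching library and does not compute Zernike descriptors. As a simple stand-in, it assumes that each shape is a small set of 3D points and describes it by moments of the point distances from the centroid, which do not change under rotation. The function names, the toy coordinates, and the library entries are all hypothetical.

```python
import math

def radial_descriptor(points, orders=(1, 2, 3, 4)):
    """Rotation-invariant stand-in descriptor (NOT a Zernike descriptor).

    Returns the k-th root of the k-th moment of the distances between
    each point and the centroid, for every k in `orders`.
    """
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    radii = [math.dist(p, centroid) for p in points]
    return [(sum(r ** k for r in radii) / n) ** (1.0 / k) for k in orders]

def descriptor_distance(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.dist(a, b)

def rotate_z(points, angle):
    """Rotate points around the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def rank_by_shape(query, library):
    """Sort library entries by descriptor distance to the query shape."""
    q = radial_descriptor(query)
    hits = [(name, descriptor_distance(q, radial_descriptor(pts)))
            for name, pts in library.items()]
    return sorted(hits, key=lambda hit: hit[1])

# Hypothetical toy shapes: an elongated block and a compact octahedron.
query_shape = [(0, 0, 0), (4, 0, 0), (0, 2, 0), (0, 0, 1), (4, 2, 1)]
toy_library = {
    "compact": [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                (0, -1, 0), (0, 0, 1), (0, 0, -1)],
    "rotated_copy": rotate_z(query_shape, 0.7),
}
ranking = rank_by_shape(query_shape, toy_library)
```

In this toy case the rotated copy of the query ranks first, because the descriptor ignores orientation. A real pipeline would use a much richer surface descriptor.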

The design pipeline depends on several new computational technologies developed during the project: an efficient library for 3D shape matching with Zernike descriptors; ZEAL, a method for aligning proteins based on surface shape; and EvoDOCK, an all-atom protein docking method based on differential evolution. EvoDOCK is capable of docking heterodimeric complexes with high accuracy when compatible backbones are available for the binding partners, and does so with an order-of-magnitude gain in computational efficiency. A symmetric version of EvoDOCK enables the docking of proteins with cubic symmetry, which in turn enables de novo prediction of cubic assemblies starting from a protein sequence. This prediction combines EvoDOCK with AlphaFold2 and addresses structures of a kind that AlphaFold2 or AlphaFold3 alone is not able to predict.
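
To make the optimization idea concrete, the following is a minimal sketch of a generic differential evolution loop (the common rand/1/bin variant). It is not EvoDOCK and does not reproduce its scoring or sampling. It assumes that a pose is a vector of six rigid-body parameters (three rotations and three translations) and replaces the all-atom energy with a hypothetical quadratic `toy_score` whose minimum is known. The parameter values (`pop_size`, `f`, `cr`) are illustrative assumptions.

```python
import random

def differential_evolution(score, bounds, pop_size=30, generations=300,
                           f=0.5, cr=0.9, seed=0):
    """Minimize `score` over box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [score(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one mutated coordinate
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cr:
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    v = min(max(v, lo), hi)  # clamp to the bounds
                else:
                    v = pop[i][d]
                trial.append(v)
            s = score(trial)
            # Greedy selection: keep the trial only if it is not worse.
            if s <= scores[i]:
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]

# Hypothetical target pose: 3 rotations (radians) and 3 translations.
TARGET = [1.0, -2.0, 0.5, 10.0, -20.0, 30.0]

def toy_score(pose):
    """Stand-in for a docking energy: squared distance to TARGET."""
    return sum((p - t) ** 2 for p, t in zip(pose, TARGET))

BOUNDS = [(-3.2, 3.2)] * 3 + [(-50.0, 50.0)] * 3
best_pose, best_score = differential_evolution(toy_score, BOUNDS)
```

The population-based search needs no gradients, which is one reason this family of methods suits rugged scoring landscapes. A real docking score is far more expensive and far less smooth than this toy function.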

To enable efficient screening and optimization of designed proteins, a method for screening for expression and stability in vivo was developed. This methodology enables the relative stability of a large number of protein variants to be evaluated in one experiment by expressing a DNA library in E. coli and screening with fluorescence-activated cell sorting. The methodology is based on the concept of monitoring the protein quality control system when a protein variant is expressed in the cell. This system also allowed us to investigate how chaperones inside cells respond to folding stress. We have also developed genetic constructs that allow us to monitor the expression of capsid proteins without having to fuse every subunit to a large fluorescent protein. In combination with the stability and expression screening methodology, these constructs enable rapid testing of many designs, as well as directed evolution of protein capsid designs.
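
One common way to turn sorted-library sequencing data into a per-variant number can be sketched as follows. This is an assumption about how such data might be analysed, not the project's analysis. It assumes that cells were sorted into a few fluorescence bins, that each bin was sequenced, and that read counts per variant and bin are available. The variant names, counts, and bin values are hypothetical, and how the resulting score relates to stability depends on the reporter used.

```python
def mean_bin_scores(counts, bin_values):
    """Weighted mean fluorescence bin for each variant.

    counts:     {variant: [reads in bin 0, reads in bin 1, ...]}
    bin_values: representative value of each bin (e.g. bin index)
    """
    n_bins = len(bin_values)
    # Total reads per bin, used to correct for unequal sequencing depth.
    totals = [sum(c[b] for c in counts.values()) for b in range(n_bins)]
    scores = {}
    for variant, c in counts.items():
        freqs = [c[b] / totals[b] if totals[b] else 0.0
                 for b in range(n_bins)]
        weight = sum(freqs)
        if weight == 0:
            continue  # variant was not observed in any bin
        scores[variant] = sum(fr * v for fr, v in zip(freqs, bin_values)) / weight
    return scores

# Hypothetical read counts for three variants across three bins.
toy_counts = {
    "wt":   [10, 80, 10],
    "mutA": [90, 10, 0],
    "mutB": [0, 10, 90],
}
scores = mean_bin_scores(toy_counts, bin_values=[1, 2, 3])
```

Variants that concentrate in one end of the fluorescence range receive scores near that end, so variants can be ranked relative to a reference such as the wild type in a single experiment.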


The experimental and computational methodology developed in this project can be of great value to researchers outside the project and may also find applications in industrial settings.
The project has demonstrated a new concept in protein design, shape-directed protein design. In shape-directed protein design, protein building blocks for the construction of larger assemblies are selected based on their 3D surface shape, enabling them to bind each other with exquisite shape complementarity to create tightly assembled higher-order complexes. A computational process has been developed to encode this concept, which enables the de novo design of protein containers starting from monomeric building blocks. The designed assemblies coming out of the process have unprecedented shape complementarity, forming closed shells with a large structural diversity and interface energetics that rival those of natural virus capsid proteins. Shape-directed protein design is a new paradigm that can be the basis for tackling many challenges in the design of large protein complexes.

The ability to screen a large number of variant proteins in a fast and cost-effective manner is central to tackling complex protein design challenges. We have developed a high-throughput assay in which millions of proteins can be screened for stability and protein expression in a single experiment. This will be broadly applicable in many areas of protein engineering and we have also shown that we can use the system to study the protein quality control system in bacteria. We have also introduced a fully automated setup for cloning, expression, and purification of design variants for in vitro studies that can readily be used to test hundreds of constructs without much manual labor. The project has also resulted in a set of unique tools for protein design, structural analysis, and structure prediction of large assemblies.
EvoDOCK approach for all-atom docking
Software developed for alignment of proteins matched by geometric shape matching
Method for high-throughput stability measurement
Prediction of symmetrical assembly structure with EvoDOCK