CORDIS - EU research results

Emergence of New Phases in Biopolymer Systems

Periodic Reporting for period 2 - EMPHABIOSYS (Emergence of New Phases in Biopolymer Systems)

Reporting period: 2022-06-16 to 2023-06-15

The aim of the EMPHABIOSYS project is to understand proteins, the molecular machines of life that are key actors in virtually all biochemical processes within the cell. At a biochemical level, proteins are short chain molecules built from a menu of 20 amino acids, each bearing a distinct side-chain, linked together into a linear polypeptide chain. The proteins we observe in nature today have evolved to perform specific functions. They are physiologically active only when their linear chain folds in aqueous solution into a unique three-dimensional structure characteristic of each protein. Knowledge of this so-called native state structure of a protein is crucial for understanding its biological function. How a protein can efficiently, reversibly, and reproducibly acquire its unique native state, starting from an extended, random-coil configuration, is the so-called protein folding problem. It represents a remarkable example of a self-assembly process that has so far eluded a complete explanation, notwithstanding more than 50 years of intensive research. Protein folding remains one of the fundamental open questions across the fields of contemporary molecular biology, biochemistry, and biophysics. The journal Science, on its 125th anniversary in 2005, classified it among the top 100 most important and challenging problems facing scientists in the next quarter century.
The solution of the protein folding problem is of paramount theoretical and practical importance. It will have an immediate impact on molecular biology, drug design, and nanotechnology. The ability to understand and mimic this biological mechanism would lead to novel, nature-inspired ways of fabricating biomaterials, engineered through a bottom-up approach. More importantly, it will contribute to the societally important issue of human health through an understanding of the principles behind protein folding and misfolding, which are responsible for cell function and malfunction.
Examples include amyloid formation, implicated in human neurodegenerative diseases such as Alzheimer’s and Parkinson’s, and type 2 diabetes due to the misfolding of insulin protein. The protein folding problem is very complex: it involves 20 different amino acid types, the role of water as a solvent, and a huge number of degrees of freedom. The puzzle is how proteins manage to find their respective unique native states so quickly. Biological folding times range from microseconds to seconds, whereas one might naively have expected astronomically long times if the search were random among the enormously large number of possible conformations. Despite all this, proteins share remarkable common properties: they are built from topologically one-dimensional alpha-helices and almost planar, effectively two-dimensional beta-sheets, connected by loops and assembled into three-dimensional native state structures. They fold rapidly and efficiently, and act as amazingly effective molecular machines. The overall objective of the project is to find from first principles, using a minimum number of essential ingredients, the hidden simplicity that must underlie the protein problem and explain all the remarkable common properties of proteins.
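The mismatch between a random search and observed folding times can be made concrete with a rough back-of-the-envelope estimate. The sketch below is illustrative only: the chain length (100 residues), the number of conformations per residue (3), and the sampling rate (10^13 conformations per second) are assumed round numbers, not values taken from the project.

```python
import math

AGE_OF_UNIVERSE_S = 4.35e17  # roughly 13.8 billion years, in seconds


def random_search_time_seconds(n_residues, states_per_residue, samples_per_second):
    """Time needed to visit every chain conformation once, in seconds."""
    # Work in log10 so the huge conformation count stays readable.
    log10_conformations = n_residues * math.log10(states_per_residue)
    return 10 ** (log10_conformations - math.log10(samples_per_second))


# Assumed, illustrative inputs (not from the project report).
search_time = random_search_time_seconds(
    n_residues=100, states_per_residue=3, samples_per_second=1e13
)

print(f"exhaustive random search: {search_time:.1e} s")
print(f"in units of the age of the universe: {search_time / AGE_OF_UNIVERSE_S:.1e}")
```

Even with these generous assumptions the exhaustive search time exceeds the age of the universe by many orders of magnitude, which is why folding in microseconds to seconds cannot be a random search.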
The native-state structures of globular proteins are stable and well-packed, indicating that self-interactions are favored over protein-solvent interactions under folding conditions. We use this drive for compactness as a guiding principle, while respecting the correct symmetry of a protein chain, which we postulate to be cylindrical. We take into account the discreteness of the chain and the common protein backbone, and thus view the protein chain as a set of equidistant thin coins. We work out the geometries of backbone conformations that allow for the systematic touching of coins, and from them derive both the value of the coin diameter and the geometries of the building blocks of protein structures: alpha-helices and strands assembled in two types of beta-sheets. We do so with no adjustable parameters, no amino acid sequence information, and no chemistry. We find an almost perfect fit between the dictates of mathematics and physics and the rules of quantum chemistry. In the assembly of these secondary structure elements, and thus in the formation of the protein native (tertiary) structure, interactions mediated by side-chains come into play. Side-chains have a range of geometries and chemistries, and there is no simply defined, let alone universal, object or orientation describing all of them. Nevertheless, a simple yet powerful concept allows us to capture this complexity in an approximate manner: pairwise attractive ‘poking’ interactions between two slices of the discrete chain that ‘poke’ towards each other as mutual local minima. These interactions are found to be sustained in a repetitive manner by the unique geometries of alpha-helices and beta-sheets. This solvent-mediated emergent pairwise attraction assembles protein building blocks, while respecting their individual symmetries.
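To illustrate what a discrete, equidistant chain on a helix looks like, the sketch below places points at equal steps along a helix and checks that consecutive points are equally spaced. It is not the project's derivation of the coin diameter or of the helix geometry. The function name `helix_points` is hypothetical, and the inputs are assumed textbook alpha-helix values for C-alpha atoms (radius about 2.3 Å, rise 1.5 Å per residue, 100° rotation per residue).

```python
import math


def helix_points(n, radius, rise, turn_deg):
    """Return n points placed at equal steps along a helix around the z axis."""
    step = math.radians(turn_deg)
    return [
        (radius * math.cos(i * step), radius * math.sin(i * step), i * rise)
        for i in range(n)
    ]


# Assumed textbook alpha-helix parameters for C-alpha positions (angstroms, degrees).
RADIUS, RISE, TURN_DEG = 2.3, 1.5, 100.0

points = helix_points(8, RADIUS, RISE, TURN_DEG)

# Distance between consecutive points: identical for every pair on a regular helix.
bonds = [math.dist(points[i], points[i + 1]) for i in range(len(points) - 1)]

# Closed form for the same distance: chord of the circle combined with the axial rise.
chord = 2 * RADIUS * math.sin(math.radians(TURN_DEG) / 2)
closed_form = math.sqrt(chord**2 + RISE**2)

print(f"consecutive spacing: {bonds[0]:.2f} (closed form {closed_form:.2f})")
for k in (3, 4):
    print(f"spacing between points i and i+{k}: {math.dist(points[0], points[k]):.2f}")
```

With these inputs the consecutive spacing comes out close to the familiar 3.8 Å distance between neighboring C-alpha atoms, and points three or four steps apart along the chain end up near each other in space, which is the kind of systematic contact a compact helix provides.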
Instead of seeking to mimic the complexity, we look for an abstraction of reality and present a simple geometrical model of a chain, which captures the essential features of globular proteins. The ground states of our model are protein-like geometries, nearly degenerate in energy, akin to protein native state structures. This proves that the free energy landscape of proteins is pre-sculpted at the backbone level, and shows why protein structures are unique in being simultaneously characterized by stability, diversity, and sensitivity.
The current state of the art can be summarized in a textbook paradigm for proteins: ‘sequence determines structure’. The novelty of our view is that protein sequences instead select from a menu of folds predetermined by symmetry and geometry. This presents an enormous simplification for evolution and natural selection: while sequences and functionalities evolve, they do so against the fixed backdrop of the immutable protein folds. The fast development of machine learning has led to a recent breakthrough in the accuracy with which the AI program AlphaFold, from Google’s DeepMind, assigns a given protein sequence its correct structure; the journal Science named it the scientific breakthrough of the year 2021. Our work provides an explanation of why machine learning is wonderfully suited to, and enormously successful in, matching a sequence to its native state structure. Our theoretical framework makes it possible to comprehend the nature of, and the mechanism behind, the formation of globular and amyloid protein structures, and has the potential to improve the predictive power of machine learning algorithms. Since symmetry plays a key role in determining the phases of matter, and because our results are completely independent of any microscopic details, they strongly point towards the existence of a new phase of matter in which protein native state structures reside. The existence of such a phase naturally opens more general questions: does life exist elsewhere in our cosmos? If yes, does it have to be based on what we know about life on Earth? Can one imagine creating nifty nanomachines in the lab, without relying on carbon chemistry, based on lessons learned from our framework? And could one conceive the beginnings of artificial life facilitated by a network of such machines working harmoniously together?
Theoretically derived geometry of protein alpha-helix.
Examples of distinct protein topologies degenerate in energy, resulting from our geometrical model.
Theoretical prediction of two distinct assemblies of a pair of zig-zag beta strands.