Objective
Enzymes are biological catalysts indispensable for biotechnology. Conventional approaches to enzyme design and optimization, which rely on biochemical intuition and combinatorial mutagenesis, have yielded significant success over decades. Building on these foundations, the TerpenCode project aims to instantly elucidate and engineer enzymatic reactions by designing a new generation of deep learning models that (1) incorporate biochemical principles as inductive biases and (2) model all intermediate biochemical transformations that occur sequentially in the active site of each enzyme. We will focus on terpene synthases, which produce the core hydrocarbon scaffolds of terpenoids, the largest and most diverse class of natural products. My group has already curated a comprehensive training dataset comprising thousands of terpene synthase reaction mechanisms. In Objective O1, we will develop deep learning models that predict the substrates, products, and reaction mechanisms of terpene synthases directly from their amino acid sequences. In Objective O2, we will engineer a generative machine learning algorithm for designing new terpene synthase variants with an altered quantitative product distribution, adjusted product stereochemistry, or new reaction cascades that lead to novel terpene products. We will validate these models experimentally through yeast expression experiments, including complete chemical structure elucidation of the detected reaction products. Breakthrough progress on these objectives would be a key step towards the holy grail of biotechnology, demonstrated for an important class of enzymes: computational prediction of exact enzyme function from amino acid sequence, and instant de novo generation of new enzymes that catalyze desired biochemical reactions. Generalizing our solutions to other classes of enzymes would enable sustainable biotechnological production of a broad spectrum of new-to-nature chemicals and bioactives.
Fields of science (EuroSciVoc)
CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques.
- natural sciences > chemical sciences > organic chemistry > hydrocarbons
- natural sciences > chemical sciences > catalysis
- natural sciences > biological sciences > biochemistry > biomolecules > proteins > enzymes
Programme(s)
- HORIZON.1.1 - European Research Council (ERC) Main Programme
Funding Scheme
HORIZON-ERC - HORIZON ERC Grants
Host institution
16610 Praha 6
Czechia