CORDIS - EU research results

Cryo-electron microscopy: mathematical foundations and algorithms

Periodic Reporting for period 4 - CRYOMATH (Cryo-electron microscopy: mathematical foundations and algorithms)

Reporting period: 2021-09-01 to 2023-02-28

The importance of understanding the functions of the basic building blocks of life, such as proteins, cannot be overstated, as this understanding unravels the mechanisms that control all organisms. The critical step towards such an understanding is to reveal the structures of these building blocks. A leading method for resolving such structures is cryo-electron microscopy (cryo-EM), in which the structure of a molecule is recovered from its images taken by an electron microscope, by using sophisticated mathematical algorithms. Due to hardware breakthroughs in recent years, cryo-EM has made a giant leap forward, introducing capabilities that until recently were unimaginable, opening an opportunity to revolutionize our biological understanding.

Because extracting information from cryo-EM experiments relies entirely on mathematical algorithms, the deep mathematical challenges that have emerged must be solved for cryo-EM to realize its tremendous potential. These challenges center on integrating information from huge sets of extremely noisy images (up to millions of images per data set) reliably and efficiently.

This project addresses the three key open challenges of cryo-EM data processing – a) deriving reliable and robust reconstruction algorithms from cryo-EM data, b) developing tools to process heterogeneous cryo-EM data sets, and c) devising validation and quality measures for structures determined from cryo-EM data. The fourth goal of the project, which ties all goals together and promotes the broad interdisciplinary impact of the project, is to merge all our algorithms into a software platform for state-of-the-art processing of cryo-EM data.
Since the beginning of the project, significant progress has been made on objective (a), "deriving reliable and robust reconstruction algorithms from cryo-EM data". The goal of this objective is to design algorithms that take as input many two-dimensional images generated by the electron microscope and estimate a three-dimensional model of the molecule using only the given images (in particular, without assuming any knowledge about the structure of the investigated molecule). This process is called "ab-initio" modeling. Obtaining accurate ab-initio models is a critical step in the cryo-EM structure determination process, because these models are then refined to high-resolution models, and the accuracy of the high-resolution models depends critically on the accuracy of the ab-initio models. Thus, the algorithms we develop must be accurate and robust to the high levels of noise in the input data. The problem is further complicated by the fact that the investigated molecule can have symmetries, and, as it turns out, the geometry of the mathematical reconstruction problem is different for each of the possible symmetries. As there are only finitely many types of symmetry in three dimensions (cyclic, dihedral, tetrahedral, octahedral, and icosahedral), we set out to derive an algorithm for each of them.

Thus far in the project, we have derived an improved algorithm for reconstructing molecules without symmetry, algorithms for molecules with cyclic symmetry (published in Inverse Problems), an algorithm for molecules with D2 (dihedral) symmetry (published in SIAM Journal on Imaging Sciences), and an algorithm for molecules with tetrahedral (T) and octahedral (O) symmetry (under a second round of review in SIAM Journal on Imaging Sciences). We are in the final stages of developing an algorithm for molecules with Dn symmetry. We did not succeed in solving the case of icosahedral (I) symmetry.


We also made progress on objective (b), “developing tools to process heterogeneous cryo-EM data sets”. A heterogeneous data set is one that contains images of different molecules, or of a single molecule in “different states” (known as conformations). We developed an algorithm that separates a heterogeneous data set into homogeneous subsets by casting the separation as the problem of partitioning the nodes of a graph into two “consistent” groups. We proved accuracy and stability bounds for our algorithm and demonstrated it on simulated as well as experimental data sets. A paper describing this work has been published in the Journal of Mathematical Analysis and Applications. We also analyzed mathematical models for heterogeneity, resulting in two papers, one in Information and Inference and one in Statistics and Computing.
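To make the idea of splitting graph nodes into two consistent groups concrete, here is a minimal sketch in Python. It is not the algorithm described in the paper: it assumes a generic spectral approach in which pairwise scores are positive when two images appear to come from the same class and negative otherwise, and it reads the two groups off the sign of the leading eigenvector. The function names, the toy score matrix, and the noise level are all hypothetical choices made for illustration.

```python
import numpy as np


def bipartition(affinity):
    """Split nodes into two groups by the sign of the leading eigenvector.

    Assumption: affinity[i, j] > 0 suggests nodes i and j belong together,
    affinity[i, j] < 0 suggests they do not.
    """
    sym = (affinity + affinity.T) / 2.0  # enforce symmetry before eigh
    _, eigvecs = np.linalg.eigh(sym)     # eigenvalues come back in ascending order
    leading = eigvecs[:, -1]             # eigenvector of the largest eigenvalue
    return (leading >= 0).astype(int)


def make_toy_affinity(labels, noise, rng):
    """Hypothetical noisy pairwise scores: +1 same class, -1 different class."""
    signs = 2 * labels - 1
    scores = np.outer(signs, signs).astype(float)
    scores += noise * rng.standard_normal(scores.shape)
    np.fill_diagonal(scores, 0.0)
    return scores


rng = np.random.default_rng(0)
true_labels = np.array([0] * 20 + [1] * 20)
affinity = make_toy_affinity(true_labels, noise=0.5, rng=rng)
estimated = bipartition(affinity)

# The two groups are only defined up to swapping their names,
# so agreement is measured against both labelings.
agreement = max(np.mean(estimated == true_labels),
                np.mean(estimated != true_labels))
print(f"fraction of nodes grouped consistently: {agreement:.2f}")
```

The sign ambiguity handled in the last step is inherent to any two-way partition: the method can say which images belong together, but not which group is "first".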

For objective (c), we developed a particle picking algorithm that is fundamentally different from other approaches to the problem. Existing methods rely either on manual labeling of a rather large number of particles or on templates provided by the user. Thus, the particle picking step in the current cryo-EM data processing pipeline is labor-intensive, error-prone, and susceptible to model bias (towards the given templates). In our research, we have shown that it is possible to automatically estimate the optimal templates for particle detection given only the input micrographs. Based on this idea, we developed a particle picking algorithm that does not suffer from the above-mentioned shortcomings and, in particular, requires neither manual labeling nor parameter tuning. This work has been published in the Journal of Structural Biology, and the accompanying software is available as open source. We then extended this work to handle contamination in micrographs; this extension was also published in the Journal of Structural Biology.
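For orientation, the detection step that a template feeds into can be sketched as cross-correlation between a micrograph and a template, followed by peak picking. This is a generic illustration, not the project's algorithm: the template here is simply given rather than estimated from the micrographs, and the synthetic micrograph, Gaussian blob, noise level, and function names are all assumptions made for the example.

```python
import numpy as np


def cross_correlate(micrograph, template):
    """Circular cross-correlation with a zero-mean template, computed via FFT."""
    zero_mean = template - template.mean()
    padded = np.zeros(micrograph.shape, dtype=float)
    padded[:zero_mean.shape[0], :zero_mean.shape[1]] = zero_mean
    spectrum = np.fft.fft2(micrograph) * np.conj(np.fft.fft2(padded))
    return np.real(np.fft.ifft2(spectrum))


def pick_peaks(score, num_peaks, exclusion):
    """Greedily take the highest scores, masking a window around each pick."""
    score = score.copy()
    peaks = []
    for _ in range(num_peaks):
        r, c = np.unravel_index(np.argmax(score), score.shape)
        peaks.append((int(r), int(c)))
        r0, r1 = max(r - exclusion, 0), min(r + exclusion + 1, score.shape[0])
        c0, c1 = max(c - exclusion, 0), min(c + exclusion + 1, score.shape[1])
        score[r0:r1, c0:c1] = -np.inf
    return peaks


# Hypothetical template: a 15x15 Gaussian blob standing in for a particle.
size, sigma = 15, 3.0
axis = np.arange(size) - size // 2
blob = np.exp(-(axis[:, None] ** 2 + axis[None, :] ** 2) / (2 * sigma ** 2))

# Synthetic micrograph: two copies of the blob plus Gaussian noise.
rng = np.random.default_rng(1)
micrograph = 0.2 * rng.standard_normal((128, 128))
true_corners = [(30, 40), (90, 70)]  # top-left corner of each particle window
for r, c in true_corners:
    micrograph[r:r + size, c:c + size] += blob

score = cross_correlate(micrograph, blob)
picks = pick_peaks(score, num_peaks=2, exclusion=size)
print("picked window corners:", picks)
```

Each pick is the top-left corner of a template-sized window. In this sketch the number of particles is passed in by hand; a real picker has to decide that from the data.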

Much progress has also been made on objective (d), “developing a publicly available software toolbox implementing the proposed algorithms”. With a programmer hired using the project’s funds, we ported all our algorithms to Python, creating a standalone free software package for structural biologists and developers.

During the project, I have also established a collaboration with Prof. Natan Nelson from the Faculty of Life Sciences at Tel Aviv University. In this collaboration we use tools developed by my group to analyze cryo-EM data acquired by his lab. This joint work resulted in a major scientific advancement, with two papers published in Nature Plants and one paper in Biochimica et Biophysica Acta (BBA).

In terms of disseminating the outcomes of the research, from the beginning of the project until today, 18 papers have been published in leading journals, and all resulting algorithms have been made publicly available.