
Single Particle Cryo-EM Reconstruction with Convolutional Neural Networks

Periodic Reporting for period 1 - EM-PRIOR (Single Particle Cryo-EM Reconstruction with Convolutional Neural Networks)

Reporting period: 2020-08-01 to 2022-07-31

The initial objective was to explore machine learning methods that have been successful in other imaging modalities, such as computed tomography, to produce more informative priors for cryo-EM structure determination. In cryo-EM data processing, the aim is to reconstruct an unknown 3D molecular structure from 2D projection images that view the structure from unknown relative orientations. Mathematically, cryo-EM structure determination is an ill-posed inverse problem: because of the high noise levels and the many unknown parameters, the data alone do not provide sufficient information to determine a unique solution. I proposed to use the regularization by denoising (RED) framework to inject prior knowledge into cryo-EM reconstruction and so better handle the ill-posedness. The denoiser would be a deep neural network trained on cryo-EM data from public databases.
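
To make the RED idea concrete, here is a minimal sketch under strong simplifying assumptions; it is not the method implemented in this project. RED minimises 0.5*||A x - y||^2 + 0.5*lam * x^T (x - D(x)), where A is the forward operator, y the data and D a denoiser; under the RED conditions the gradient of the prior term is simply lam * (x - D(x)). In the sketch, a 1D signal replaces the 3D structure, the identity replaces the cryo-EM projection operator, and a moving-average filter stands in for the trained neural network. All function and parameter names are hypothetical.

```python
import numpy as np

def box_denoiser(x, width=5):
    """Stand-in denoiser: a moving-average filter (a trained CNN would go here)."""
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="same")

def red_gradient_descent(y, forward, adjoint, denoiser,
                         lam=1.0, step=0.2, n_iter=200):
    """Gradient descent on 0.5*||A x - y||^2 + 0.5*lam * x^T (x - D(x)).

    The prior gradient lam * (x - D(x)) is the RED identity; it is exact
    for the symmetric linear filter used here.
    """
    x = adjoint(y)  # simple initialisation from the data
    for _ in range(n_iter):
        grad = adjoint(forward(x) - y) + lam * (x - denoiser(x))
        x = x - step * grad
    return x

# Toy problem (assumption): identity forward model, i.e. pure denoising.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 256)
clean = np.sin(2 * np.pi * 3 * t)
noisy = clean + 0.5 * rng.standard_normal(t.size)

identity = lambda v: v
estimate = red_gradient_descent(noisy, identity, identity, box_denoiser)

err_noisy = np.mean((noisy - clean) ** 2)
err_red = np.mean((estimate - clean) ** 2)
print(f"MSE of noisy data: {err_noisy:.4f}, MSE after RED: {err_red:.4f}")
```

The denoiser enters only through the term x - D(x), so a stronger learned denoiser could in principle be swapped in without changing the optimisation loop.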

By tapping into the vast amounts of prior knowledge about protein structures available in public databases, the proposed methods have the potential not only to improve existing cryo-EM applications, but also to extend the scope of cryo-EM structure determination to many more targets than is currently possible. This includes important drug targets, like GPCRs, that are currently outside the size limit of what cryo-EM can resolve to high resolution.
In a proof-of-principle paper with simulated data, we showed that the proposed method outperforms the established approaches. The next objective was to show that this improvement also holds for experimental data. Here, the difficulty is finding suitable ground truth data to train the denoiser. We first tried generative methods to model both the noise present in reconstructions and the signal in high-resolution reconstructions. This required carefully curating a training dataset for a generative model that, given an atomic model, could produce ground truth data for the denoiser. The approach was moderately successful, and we were able to create a denoiser that gave improved results, but it turned out to be limited. We are therefore now exploring an unsupervised approach to training the denoiser, based on the well-established Noise2Noise method. It requires no ground truth and thus avoids the main issue of the previous approach. Preliminary results are promising, and we are in the process of writing a paper on this project.
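
The principle behind Noise2Noise can be illustrated with a small sketch, again under assumptions that go well beyond this report. A denoiser is fitted to map one noisy observation to a second, independently noisy observation of the same signal; because the noise in the target is independent of the input, the fitted denoiser approaches the one that would be obtained with clean targets. Here a 9-tap linear filter fitted by least squares stands in for a neural network, and two synthetic 1D signals stand in for independent noisy reconstructions. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, width = 20000, 9
t = np.arange(n)

# Synthetic signal and two independent noisy observations of it (assumption).
clean = np.sin(2 * np.pi * t / 200) + 0.5 * np.sin(2 * np.pi * t / 57)
half_a = clean + 0.5 * rng.standard_normal(n)
half_b = clean + 0.5 * rng.standard_normal(n)

# Each row of X is a window of the noisy input; the target is the centre sample.
X = np.lib.stride_tricks.sliding_window_view(half_a, width)
centre = slice(width // 2, n - width // 2)

# Noise2Noise: regress noisy input windows onto the *other* noisy observation.
w_n2n, *_ = np.linalg.lstsq(X, half_b[centre], rcond=None)
# Supervised reference: the same regression with clean targets.
w_sup, *_ = np.linalg.lstsq(X, clean[centre], rcond=None)

denoised = X @ w_n2n
mse_noisy = np.mean((half_a[centre] - clean[centre]) ** 2)
mse_n2n = np.mean((denoised - clean[centre]) ** 2)
print(f"MSE noisy input: {mse_noisy:.4f}, MSE Noise2Noise filter: {mse_n2n:.4f}")
print("largest weight difference to supervised fit:",
      np.max(np.abs(w_n2n - w_sup)))
```

The clean signal is used only to evaluate the result and to fit the supervised reference; the Noise2Noise weights themselves are fitted from the two noisy observations alone.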

In parallel to the above project, we developed methods for heterogeneous data processing based on a novel take on the variational autoencoder that combines classical machine learning with deep learning to improve speed and convergence by several orders of magnitude. The first part of this project has been submitted to the NeurIPS conference proceedings. We are in the process of writing up the second part of this method.

Additionally, I worked on the development of novel algorithms for the software package RELION that improve the reconstruction pipeline in speed, quality of results and automation.
The RED project already shows improvements over the state of the art, and we see indications that further improvements are possible. We expect to resolve the final issues in the coming months and to start distributing the method with the next version of RELION. This will enable fast dissemination to the field of structural biology and, above all, the cryo-EM reconstruction of difficult biological targets, with important implications for basic research and drug development.

Heterogeneous reconstruction in cryo-EM is important for understanding the structural diversity and dynamics exhibited by biological molecular systems. In particular, combined with improvements in cryo-electron tomography of in situ samples, these computational methods should provide substantial insight into the structural diversity associated with different cellular compartments.

Improvements in automated image processing that I introduced in RELION are now being used in cryo-EM facilities around the world for efficient on-the-fly processing of data.
(Panel A) Overview of method for the first part of heterogeneous reconstruction
