A Bayesian Framework for Cellular Structural Biology

Project information

BayCellS

Grant agreement ID: 294809

Closed project

Start date 1 May 2012

End date 30 April 2017

Funded under

Specific programme: "Ideas" implementing the Seventh Framework Programme of the European Community for research, technological development and demonstration activities (2007 to 2013)

Total cost

€ 2 130 212.00

EU contribution

€ 2 130 212.00

Coordinated by

INSTITUT PASTEUR
France

Final Report Summary - BAYCELLS (A Bayesian Framework for Cellular Structural Biology)

The functioning of a single cell or organism is governed by the laws of chemistry and physics. In fine detail, the bridge from biology to chemistry and physics is provided by structural biology: to understand the functioning of a cell, it is necessary to know the atomic structure of macromolecular assemblies, which may contain hundreds of components. However, knowing the static structure is not enough; it is also necessary to investigate and understand the dynamic interplay of these hundreds of components. This understanding requires bridging from the atomic-scale resolution of structural biology to the much longer length scales that are the realm of cellular biology. A particular feature of molecular systems is that these length scales are not always well separated: a small perturbation (for example, the binding of a few small molecules) can have a very large effect. A detailed understanding of the molecular systems of life forms the basis of innovative therapeutic strategies, by identifying new drug targets and new ways of interfering with pathogenic processes.
Investigating and understanding dynamic multi-component molecular systems and their spatio-temporal complexity is the biggest challenge for structural biology today. Traditionally, the field has been dominated by the study of static structures of isolated molecules with a single technique, most prominently X-ray crystallography. For the formation and evolution of transient complexes within a living cell, this type of high-resolution structure determination by a single technique will likely remain the exception; instead, data must be acquired with multiple biophysical techniques at multiple scales. These data must then be integrated into one consistent, dynamic picture that relies both on emerging experimental technologies and on molecular modelling and numerical simulations, in a truly integrative structural biology approach.
Bayesian approaches have decisive advantages and are increasingly being used in the wider context of structural biology. We pioneered an approach, Inferential Structure Determination, which we first developed for NMR data only and which goes beyond such uses by treating structure determination itself as a Bayesian data analysis problem. Bayesian approaches can determine all unknown “nuisance” parameters during the structure calculation and provide rigorous error estimates for the coordinates and for all unknown parameters. This is of particular importance in integrative structural biology, where data come from different sources, each with its own approximate forward model relating structures to data, its own nuisance parameters, and its own level of molecular description. The disadvantage of Bayesian methods is their greater computational cost compared to standard structure calculation approaches.
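As a sketch in generic notation (the symbols are our own labels and are not taken from the report), let X denote the atomic coordinates, θ the nuisance parameters, D the experimental data and I the prior information. Treating structure determination as Bayesian inference then amounts to working with the joint posterior below, where the forward model enters through the likelihood p(D | X, θ, I). The factorised prior assumes that coordinates and nuisance parameters are independent a priori, which is a simplifying assumption made here for illustration.

```latex
p(X, \theta \mid D, I) \;\propto\; p(D \mid X, \theta, I)\, p(X \mid I)\, p(\theta \mid I),
\qquad
p(X \mid D, I) = \int p(X, \theta \mid D, I)\, \mathrm{d}\theta
```

Error estimates for the coordinates follow from the spread of the marginal posterior p(X | D, I), in which the nuisance parameters have been integrated out rather than fixed by hand.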
Within the project, we developed a Bayesian framework for integrative structural biology. This includes forward models for all relevant data used in integrative approaches. Notably, we developed forward models for data from chemical cross-linking and mass spectrometry, including a fast method to verify whether a cross-link traverses a protein in a given model; a Bayesian method to merge small-angle scattering (SAS) curves from multiple experiments, together with a forward model for SAS data from X-ray or neutron scattering; a new fast and scalable representation of molecules and electron microscopy data; and a Bayesian approach to interpret chromosome capture experiments in order to characterise genome organisation. We also developed approaches to use evolutionary information in isolation or in combination with other experimental data (NMR, cryo-electron microscopy). These approaches take into account the ensemble nature of experimental data, even for sparse data, either implicitly (for SAS data) or explicitly during the modelling (for chemical cross-linking or NMR data).
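To illustrate what a forward model for cross-linking data might look like, here is a minimal Python sketch. It is not the project's method: the function names, the logistic form, the 30 Å maximum linker span and the uncertainty `sigma` are all assumptions chosen for illustration, and the sketch uses the straight-line distance between two residues, so it does not check whether the cross-link path passes through the protein.

```python
import math


def crosslink_log_likelihood(coord_a, coord_b, max_length=30.0, sigma=2.0):
    """Log-probability of observing a cross-link between two positions.

    Hypothetical forward model: the probability is a logistic function of
    (max_length - distance) / sigma, so it is close to 1 when the residues
    are well within reach of the linker and close to 0 when they are not.
    Distances are in angstroms; sigma plays the role of a nuisance parameter.
    """
    distance = math.dist(coord_a, coord_b)  # straight-line distance only
    z = (max_length - distance) / sigma
    # Numerically stable evaluation of log(1 / (1 + exp(-z))).
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def total_log_likelihood(coords, crosslinks, max_length=30.0, sigma=2.0):
    """Sum the log-likelihoods of independent cross-links.

    coords maps a residue label to an (x, y, z) tuple; crosslinks is a list
    of (label_a, label_b) pairs. Independence is an assumption.
    """
    return sum(
        crosslink_log_likelihood(coords[a], coords[b], max_length, sigma)
        for a, b in crosslinks
    )


if __name__ == "__main__":
    model = {"K10": (0.0, 0.0, 0.0), "K55": (12.0, 5.0, 3.0), "K90": (45.0, 0.0, 0.0)}
    print(total_log_likelihood(model, [("K10", "K55"), ("K10", "K90")]))
```

In a Bayesian treatment, `sigma` would be sampled together with the coordinates instead of being fixed, which is what determining nuisance parameters during the calculation means in practice.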
The Bayesian analysis is based on sampling very large numbers of conformations and nuisance parameters. We developed new algorithms, based on recent advances in theoretical statistical mechanics and on a new coarse-grained description, to sample molecular conformations compatible with experimental data, as well as a method to classify and cluster very large numbers of conformations. This analysis method can also be used to analyse "big data" in general. All methods were applied to a number of challenging and biologically interesting systems, with data obtained either from the literature or from collaborating groups: pili from a type II secretion system; Pol II and Pol III RNA polymerases; complement factor; and chromosome organisation.
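The idea of sampling a structural quantity jointly with a nuisance parameter can be shown on a toy problem. The sketch below uses a plain Metropolis sampler, which is a deliberately simple stand-in and not one of the algorithms developed in the project. The setup is assumed for illustration: a single unknown distance `x`, repeated noisy measurements of it, an unknown noise level `sigma`, a flat prior on `x` and a Jeffreys prior on `sigma` (flat in `log(sigma)`).

```python
import math
import random


def log_posterior(x, log_sigma, data):
    """Unnormalised log-posterior in the variables (x, log sigma).

    Gaussian likelihood with unknown noise sigma. With a flat prior on x and
    a Jeffreys prior on sigma, the prior is flat in log sigma, so only the
    likelihood terms remain.
    """
    sigma = math.exp(log_sigma)
    squared_residuals = sum((d - x) ** 2 for d in data)
    return -len(data) * log_sigma - squared_residuals / (2.0 * sigma ** 2)


def metropolis(data, n_steps=20000, burn_in=2000, seed=0):
    """Sample (x, sigma) jointly with symmetric Gaussian proposals."""
    rng = random.Random(seed)
    x, log_sigma = data[0], 0.0
    current = log_posterior(x, log_sigma, data)
    samples = []
    for step in range(n_steps):
        x_new = x + rng.gauss(0.0, 0.2)
        log_sigma_new = log_sigma + rng.gauss(0.0, 0.3)
        proposed = log_posterior(x_new, log_sigma_new, data)
        # Metropolis rule: accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposed - current)):
            x, log_sigma, current = x_new, log_sigma_new, proposed
        if step >= burn_in:
            samples.append((x, math.exp(log_sigma)))
    return samples


if __name__ == "__main__":
    measurements = [9.8, 10.4, 10.1, 9.6, 10.3, 9.9, 10.2, 10.0]  # made-up values
    draws = metropolis(measurements)
    mean_x = sum(x for x, _ in draws) / len(draws)
    mean_sigma = sum(s for _, s in draws) / len(draws)
    print(f"posterior mean distance: {mean_x:.2f}, posterior mean sigma: {mean_sigma:.2f}")
```

The spread of the sampled `x` values gives an error estimate for the distance that already accounts for the uncertainty in `sigma`. Real structure calculations have thousands of coordinates and rugged posteriors, which is why more advanced sampling schemes than this one are needed.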