## Mid-Term Report Summary - MSMATH (Molecular Simulation: modeling, algorithms and mathematical analysis)

The project aims to study, from a mathematical point of view, the algorithms used by physicists, chemists and biologists to model the evolution of materials at the molecular level.

At the molecular level, the basic modeling ingredient is a potential function which associates an energy with each configuration of the system, that is, with each set of atomic positions. Tools from statistical physics are then used to extract macroscopic information from the microscopic description encoded in the potential function. This information can be static (averages of observables of interest with respect to statistical ensembles) or dynamic (trajectories describing transitions between different macroscopic states of the material under study). Molecular dynamics aims at simulating the evolution of molecular systems in order to compute this information numerically. It is used on a daily basis to obtain either quantitative information (physical characteristics of materials, transition times, etc.) or qualitative information (transition mechanisms, reaction pathways, etc.).

The algorithmic difficulty is twofold. First, the number of particles is very large, which leads to high-dimensional configuration vectors. Second, the timescale at the microscopic level is orders of magnitude smaller than the timescales of interest at the macroscopic level. This timescale discrepancy is related to the central concept of this project: metastability. Metastability refers to the fact that the stochastic process modeling the physical system at the molecular level remains trapped in some regions of the configuration space for very long times. These regions are called metastable states. In naive simulations, transitions between these states are very rarely observed, even though these transition events are precisely those that matter at the macroscopic level. Metastability is one of the major bottlenecks in making molecular simulations predictive for real-life test cases.

The objective of the MSMath project is to mathematically characterize and quantify metastability, in order to analyze the efficiency and accuracy of algorithms which aim at circumventing the sampling difficulties raised by metastability.
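The trapping phenomenon described above can be illustrated on a toy problem. The following minimal sketch is not taken from the project: it assumes a one-dimensional double-well potential V(x) = (x^2 - 1)^2, overdamped Langevin dynamics discretized with the Euler-Maruyama scheme, and hypothetical names and parameter values (`count_transitions`, `beta`, `dt`, the thresholds at ±0.8). It counts how often a naive simulation moves from one well to the other.

```python
import math
import random


def grad_V(x):
    # Gradient of the assumed double-well potential V(x) = (x^2 - 1)^2,
    # which has two minima at x = -1 and x = +1 and a barrier of height 1 at x = 0.
    return 4.0 * x * (x * x - 1.0)


def count_transitions(beta, n_steps=100_000, dt=1e-3, seed=0):
    """Count well-to-well transitions of overdamped Langevin dynamics.

    The dynamics dX = -V'(X) dt + sqrt(2 / beta) dW is discretized with
    the Euler-Maruyama scheme. `beta` is the inverse temperature.
    """
    rng = random.Random(seed)
    x = -1.0          # start in the left well
    state = -1        # label of the well visited last
    transitions = 0
    noise = math.sqrt(2.0 * dt / beta)
    for _ in range(n_steps):
        x += -grad_V(x) * dt + noise * rng.gauss(0.0, 1.0)
        # A transition is recorded only once the trajectory is well inside
        # the other well, so that recrossings near the barrier are not counted.
        if state == -1 and x > 0.8:
            state = 1
            transitions += 1
        elif state == 1 and x < -0.8:
            state = -1
            transitions += 1
    return transitions


if __name__ == "__main__":
    for beta in (1.0, 10.0):
        print(f"beta = {beta}: {count_transitions(beta)} transitions")
```

In this toy setting, one would expect transitions to become much rarer as `beta` grows (lower temperature), while the cost per time step stays the same. This is the timescale discrepancy that the algorithms studied in the project aim to circumvent.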

In the first half of the project, progress has been made in various directions. Concerning the sampling of statistical ensembles and the computation of static information, new adaptive biasing algorithms have been proposed, and the mathematical analysis of some adaptive biasing potential techniques has led to the development of improved variants. We also investigated new Metropolis-Hastings techniques to compute transport coefficients, as well as the benefit of using nonreversible dynamics for improved sampling. Concerning the sampling of trajectories, we are working in two directions. First, we rely heavily on the notion of quasi-stationary distribution to characterize metastable states. This appears to be useful both theoretically, to justify jump Markov models (kinetic Monte Carlo models), and numerically, to analyze accelerated dynamics algorithms. Thanks to this approach, we have been able, for example, to propose variants of the parallel replica dynamics which enlarge the range of applicability of this algorithm. Second, the adaptive multilevel splitting algorithm has been adapted to sample reactive trajectories and compute transition rates. This numerical method is currently being implemented in molecular dynamics software, in collaboration with companies (CEA, SANOFI). Finally, various coarse-graining techniques are under study: dynamic reduction to obtain an effective dynamics along reduced degrees of freedom, dissipative particle dynamics, and greedy algorithms to obtain tensor product representations of high-dimensional functions, such as the free energy.
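To give an idea of how adaptive multilevel splitting proceeds, here is a minimal sketch on a toy problem. It is an illustration under stated assumptions, not the implementation developed in the project: the potential V(x) = (x^2 - 1)^2, the reaction coordinate (the position x itself), the sets A = {x <= a} and B = {x >= b}, and all names and parameter values (`ams_probability`, `n_rep`, `beta`, `x0`, ...) are hypothetical. The sketch uses a simple variant in which the replicas with the lowest maximum level are discarded at each iteration and replaced by branching a surviving replica at its first crossing of that level. It estimates the probability that a trajectory started at `x0` reaches B before A.

```python
import math
import random


def grad_V(x):
    # Gradient of the assumed double-well potential V(x) = (x^2 - 1)^2.
    return 4.0 * x * (x * x - 1.0)


def run_path(x, rng, beta, dt, a, b):
    """Simulate overdamped Langevin dynamics (Euler-Maruyama) from x until
    the trajectory enters A = {x <= a} or B = {x >= b}; return all positions."""
    path = [x]
    noise = math.sqrt(2.0 * dt / beta)
    while a < x < b:
        x += -grad_V(x) * dt + noise * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def ams_probability(n_rep=50, beta=4.0, dt=1e-3, x0=-0.9, a=-1.0, b=1.0, seed=0):
    """Estimate P(reach B before A | start at x0) by multilevel splitting."""
    rng = random.Random(seed)
    paths = [run_path(x0, rng, beta, dt, a, b) for _ in range(n_rep)]
    # The level of a replica is the maximum of the reaction coordinate along it.
    levels = [max(p) for p in paths]
    estimate = 1.0
    while min(levels) < b:
        z = min(levels)
        killed = [i for i, lv in enumerate(levels) if lv <= z]
        alive = [i for i, lv in enumerate(levels) if lv > z]
        if not alive:
            return 0.0  # extinction: every replica sits at the same level
        # Each iteration contributes the fraction of surviving replicas.
        estimate *= len(alive) / n_rep
        for i in killed:
            parent = paths[rng.choice(alive)]
            # Branch at the first point where the parent exceeds level z ...
            cut = next(k for k, y in enumerate(parent) if y > z)
            # ... and resimulate independently from there until A or B is hit.
            new_path = parent[:cut + 1] + run_path(parent[cut], rng, beta, dt, a, b)[1:]
            paths[i] = new_path
            levels[i] = max(new_path)
    # On exit every replica has reached B, so no final correction is needed.
    return estimate


if __name__ == "__main__":
    print("estimated probability of reaching B before A:", ams_probability())
```

The replicas held in `paths` at the end of the loop all end in B, which is how such a procedure yields a sample of reactive trajectories in addition to a probability estimate. Turning this probability into a transition rate requires additional ingredients that are not covered by this sketch.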

In conclusion, this project uses various mathematical tools from the analysis of partial differential equations and from probability theory to quantify the metastability of the stochastic processes which appear in molecular dynamics simulations. In close collaboration with practitioners (biologists and physicists), we analyze and develop algorithms to circumvent the difficulties raised by the simulation of metastable dynamics over very long timescales.
