Extreme-scale Mathematically-based Computational Chemistry

Periodic Reporting for period 2 - EMC2 (Extreme-scale Mathematically-based Computational Chemistry)

Reporting period: 2021-03-01 to 2022-08-31

Molecular simulation has become an instrumental tool in chemistry, condensed matter physics, molecular biology, materials science, and nanosciences. It will enable the de novo design of, for example, new drugs or materials, provided that the efficiency of the underlying software is improved by several orders of magnitude.

The ambition of the EMC2 project is to achieve scientific breakthroughs in this field by gathering the expertise of a multidisciplinary community at the interfaces of four disciplines: mathematics, chemistry, physics, and computer science. We are indeed convinced that the only way to further improve the efficiency of the solvers, while preserving accuracy, is to develop physically and chemically sound models, mathematically certified and numerically efficient algorithms, and implement them in a robust and scalable way on various architectures (from standard academic or industrial clusters to emerging heterogeneous and exascale architectures).

EMC2 has no equivalent in the world: nowhere else is there such a critical mass of interdisciplinary researchers who are already collaborating and have the track record required to address this challenge. Under the leadership of the four PIs, supported by highly recognized teams from three major institutions in the Paris area, EMC2 is developing disruptive methodological approaches and publicly available simulation tools, and applying them to challenging molecular systems. The project has substantially strengthened the local teams and their synergy, enabling decisive progress in the field.

The project is organized in four work packages:

WP1: High-dimensional and large-scale problems in molecular simulation
WP2: Reduction of complexity
WP3: Validation and certification of the results
WP4: Challenging systems

Progress has been made in all of these work packages. In the period Sept. 1st 2019 – Aug. 31st 2021, 31 articles were published in peer-reviewed international journals, another 40 preprints were submitted for publication, and 2 PhD theses (partly supported by EMC2 funds) were defended. The distribution across work packages is 33/71 for WP1, 15/71 for WP2, 13/71 for WP3, and 10/71 for WP4. The main results achieved during the first 24 months of the project are the following (the most significant of them will be presented in more detail in Section 2):

WP1:
Progress in the development of tensor compression techniques and associated low rank matrix approximation algorithms (tasks 1.1 and 1.4).
First results obtained using randomization techniques for solving large-scale linear systems (task 1.2).
Progress in the mathematical analysis and development of new, efficient numerical methods for mean-field models (task 1.3) and phase space sampling (task 1.5).
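
To give a concrete flavor of what a randomization technique for linear systems can look like, here is a minimal sketch of the randomized Kaczmarz iteration, a well-known textbook method. It is an illustrative assumption only: the summary does not state which randomized algorithms task 1.2 uses, and the function name, the small test matrix, and the iteration count are all hypothetical.

```python
import random


def randomized_kaczmarz(A, b, iterations=2000, seed=0):
    """Approximate a solution of the consistent linear system A x = b.

    Each step picks one row at random, with probability proportional to
    its squared Euclidean norm, and projects the current iterate onto
    the hyperplane defined by that single equation.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    row_norms_sq = [sum(a * a for a in row) for row in A]
    row_indices = list(range(len(A)))
    x = [0.0] * len(A[0])
    for _ in range(iterations):
        i = rng.choices(row_indices, weights=row_norms_sq)[0]
        # Signed distance (scaled) from x to the hyperplane of row i.
        residual = b[i] - sum(a * xj for a, xj in zip(A[i], x))
        step = residual / row_norms_sq[i]
        x = [xj + step * a for xj, a in zip(x, A[i])]
    return x


# Hypothetical 4 x 3 consistent system built from a known solution.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0],
     [1.0, 0.0, 1.0]]
x_true = [1.0, -2.0, 3.0]
b = [sum(a * xj for a, xj in zip(row, x_true)) for row in A]

x = randomized_kaczmarz(A, b)
error = max(abs(xi - ti) for xi, ti in zip(x, x_true))
print("max abs error:", error)
```

Each iteration touches a single row of the matrix, which is what makes methods of this kind attractive when the full system is too large to factorize.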

WP2:
First demonstration of QM/MM molecular dynamics simulation under extreme pressure (task 2.2).
First results of molecular dynamics simulations using a stable parametrization of the dd-COSMO model along with the Poisson–Boltzmann and generalized Kirkwood models (task 2.3).
Development and finalization of a new multi-GPU version of the Tinker-HP software (task 2.4).
New contributions in the field of sampling, with the introduction of the BOUNCE integrator, and in the field of quantum nuclear effects, with a new-generation Adaptive Quantum Thermal Bath algorithm (task 2.4).

WP3:
Development from scratch of DFTK, an electronic structure package written in the Julia language and a suitable tool to develop and test algorithms and error estimators for DFT.
New contributions to the a priori and a posteriori error analysis of electronic structure calculation methods (tasks 3.2, 3.3, and 3.5).

WP4:
First large-scale, microsecond-long simulations related to COVID-19 using the new Tinker-HP GPU infrastructure and polarizable force fields (task 4.2).

The EMC2 project focuses on three complementary yet intertwined methodological challenges:

- develop efficient algorithms for molecular simulation based on cutting-edge techniques from scientific computing. Our goal is to develop and analyze entirely new numerical methods and algorithms for molecular simulation problems that lead to linear or nonlinear systems of equations, or to sampling problems, characterized by high dimensionality. The algorithms will be numerically sound; they will exploit massive parallelism in both space and time through deterministic or stochastic approaches, minimize communication, and, as a by-product, reduce the energy consumption of the simulations. One important goal is to create a tensor library that handles high-dimensional problems and achieves the highest possible efficiency on massively parallel machines;

- make a detailed analysis of the chain of approximations leading from an extremely accurate but computationally intractable model (e.g. the N-body Schrödinger equation) to a reduced model that can be simulated with the available computer resources, and derive mathematical criteria assessing the range of validity of these approximations. The main purpose is to mathematically assess the quality and the range of applicability of various existing or new reduced models for chemically relevant applications, and to chart a route toward their application to large-scale systems in connection with the HPC tasks of the project;

- introduce certification and validation methods (i.e. a posteriori error bounds, computer proofs of programs) allowing one to complement simulation results with mathematically guaranteed error bars. A deliverable of the task will be the release of a version of DFTK providing a fully certified ground state of the Hartree-Fock model, and a partially certified ground state of the Kohn-Sham model (whose non-convexity forbids a truly guaranteed solution), in a plane-wave basis. This means that the energies and forces obtained by the code will be proven correct (within the particular model chosen) up to a specified error.
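
As a toy illustration of what a mathematically guaranteed error bar means, the sketch below uses the classical residual bound for a real symmetric matrix A: for any nonzero vector x and any real number lam, at least one exact eigenvalue of A lies within ||A x - lam x|| / ||x|| of lam. This is not DFTK code and not the project's estimator; the matrix, the function names, and the use of power iteration are assumptions chosen for brevity. The bound is also evaluated in ordinary floating-point arithmetic, so rounding errors are not themselves certified here.

```python
import math


def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(xi * xi for xi in x))


def power_iteration(A, steps):
    """Crude approximation of the dominant eigenvector of A."""
    x = [1.0] * len(A)
    for _ in range(steps):
        y = matvec(A, x)
        n = norm(y)
        x = [yi / n for yi in y]
    return x


def eigenvalue_with_error_bar(A, x):
    """Return (lam, bound) for symmetric A and a nonzero trial vector x.

    lam is the Rayleigh quotient of x; bound is the residual norm
    divided by ||x||. Some exact eigenvalue lies in [lam - bound, lam + bound].
    """
    Ax = matvec(A, x)
    lam = sum(xi * yi for xi, yi in zip(x, Ax)) / sum(xi * xi for xi in x)
    residual = [yi - lam * xi for xi, yi in zip(x, Ax)]
    return lam, norm(residual) / norm(x)


# Hypothetical test matrix: 1D discrete Laplacian on three points,
# whose exact eigenvalues are 2 - sqrt(2), 2, and 2 + sqrt(2).
A = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]

x = power_iteration(A, steps=5)  # deliberately few steps: x is inexact
lam, bound = eigenvalue_with_error_bar(A, x)
print("eigenvalue estimate:", lam, "+/-", bound)
```

The point of the design is that the error bar is computed from the approximate solution alone, without knowing the exact answer, which is the defining feature of an a posteriori bound.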

These methodological advances will be applied to three classes of challenging systems of major practical interest: large solvated biosystems (with an emphasis on COVID-19), molecules with strongly correlated electrons, and 2D materials.