Periodic Reporting for period 3 - EMC2 (Extreme-scale Mathematically-based Computational Chemistry)
Reporting period: 2022-09-01 to 2024-02-29
The ambition of the EMC2 project is to achieve scientific breakthroughs in this field by gathering the expertise of a multidisciplinary community at the interfaces of four disciplines: mathematics, chemistry, physics, and computer science. We are indeed convinced that the only way to further improve the efficiency of the solvers, while preserving accuracy, is to develop physically and chemically sound models, mathematically certified and numerically efficient algorithms, and implement them in a robust and scalable way on various architectures (from standard academic or industrial clusters to emerging heterogeneous and exascale architectures).
EMC2 has no equivalent in the world: nowhere else is there such a critical mass of interdisciplinary researchers already collaborating and with the track record required to address this challenge. Under the leadership of the 4 PIs, supported by highly recognized teams from three major institutions in the Paris area, EMC2 is developing disruptive methodological approaches and publicly available simulation tools, and applying them to challenging molecular systems. The project has considerably strengthened the local teams and their synergy, enabling decisive progress in the field.
WP1: High-dimensional and large-scale problems in molecular simulation
WP2: Reduction of complexity
WP3: Validation and certification of the results
WP4: Challenging systems
Progress has been made in all these work packages in the period Sept. 1st 2019 – Aug. 31st 2023. In terms of publications, 109 articles have been published in peer-reviewed international journals, another 151 preprints have been submitted for publication, and 6 PhD theses (supported by EMC2 funds) have been defended. The main results achieved during the first 48 months of the project are the following:
WP1:
Progress in the development of tensor compression techniques and associated low-rank matrix approximation algorithms (tasks 1.1 and 1.4).
First results obtained using randomization techniques for solving large-scale linear systems and eigenvalue problems (task 1.2).
Elucidation of the respective merits of direct minimization methods vs self-consistent field algorithms for large-scale or ill-conditioned nonlinear eigenvalue problems arising in electronic structure calculation (task 1.3).
Exploration of the capability of auto-encoders to build efficient collective variables for enhanced sampling. Tests on challenging biological molecules confirm the value of such machine-learnt collective variables, in particular when they are used in combination with free energy biasing techniques such as the Adaptive Biasing Force (task 1.5).
WP2:
Development of a systematic method for deriving effective independent-particle models for low-energy excitations of 2D materials directly from DFT (task 2.1).
First demonstration of QM/MM molecular dynamics simulation under extreme pressure (task 2.2).
First results of molecular dynamics simulations using a stable parametrization of dd-COSMO along with the Poisson–Boltzmann and generalized Kirkwood models (task 2.3).
Development and finalization of a new multi-GPU version of the Tinker-HP software (task 2.4).
Progress in the mathematical analysis and development of new, efficient numerical methods for mean-field models; contributions in the field of sampling, with the introduction of the BOUNCE integrator, and in the field of quantum nuclear effects, with a new-generation Adaptive Quantum Thermal Bath algorithm (task 2.4).
Progress in two classes of methods investigated and further developed at ENPC and Inria (accelerated dynamics and Adaptive Multilevel Splitting) and successful applications to challenging problems (task 2.5).
Development from scratch of Deep-HP, the machine learning extension of the Tinker-HP molecular dynamics package (task 2.6).
WP3:
Development from scratch of DFTK, an electronic structure package in the Julia language, and a suitable tool to develop and test algorithms and error estimators for DFT.
Development of error estimators for the coupled-cluster method (task 3.1).
New contributions to the a priori and a posteriori error analysis of electronic structure calculation methods, and their use to propose efficient algorithms implemented and tested in DFTK (tasks 3.2, 3.3, and 3.5).
WP4:
First mathematical analysis of the Density-Matrix Embedding Theory (DMET), a representative of the class of quantum embedding methods (task 4.1).
First large-scale, microsecond-long simulations related to COVID-19 using the new Tinker-HP GPU infrastructure and polarizable force fields (task 4.2).
Formal derivation of an effective moiré-scale model for twisted bilayer graphene from DFT (task 4.3).
- develop efficient algorithms for molecular simulation based on cutting-edge techniques from scientific computing. Our goal is to develop and analyze entirely new numerical methods and algorithms for molecular simulation problems that lead to linear or nonlinear systems of equations or to sampling problems characterized by high dimensionality. The algorithms will be numerically sound; they will exploit massive parallelism in both space and time, through deterministic or stochastic approaches; and they will minimize communication, which as a by-product also reduces the energy consumption of the simulation. One important goal is to create a tensor library that deals with high-dimensional problems and achieves the highest possible efficiency on massively parallel machines;
- make a detailed analysis of the chain of approximations leading from an extremely accurate but computationally intractable model (e.g. the N-body Schrödinger equation) to a reduced model which can be simulated with the available computer resources, and derive mathematical criteria assessing the range of validity of these approximations. The main purpose is to mathematically assess the quality and the range of applicability of various existing or new reduced models for chemically relevant applications, and to chart a route toward their application to large-scale systems in connection with the HPC tasks of the project;
- introduce certification and validation methods (i.e. a posteriori error bounds, computer proofs of programs) allowing one to complement simulation results with mathematically guaranteed error bars. A deliverable of the task will be the release of a version of DFTK providing a fully certified ground state of the reduced Hartree-Fock model, and a partially certified ground state of the Kohn-Sham model (whose non-convexity rules out a truly guaranteed solution), in a plane-wave basis. This means that the energies and forces obtained by the code will be proven correct (within the particular model chosen) up to a specified error.
These methodological advances will be applied to three classes of challenging systems of major practical interest: large solvated biosystems (with an emphasis on COVID-19), molecules with strongly-correlated electrons, and 2D materials.
Since the inception of the project, and owing to its strongly interdisciplinary nature, we have added a further line of research dedicated to quantum computing for chemistry.