Periodic Reporting for period 1 - ML Potentials (Constructing Intermolecular Potentials by Combining Physics and Machine Learning)
Reporting period: 2018-03-15 to 2020-03-14
Intermolecular potentials can be accurately quantified by applying high-level molecular quantum mechanics methods. Although these methods are physically sound and mathematically rigorous, their computational cost grows steeply with the size of the system. Conversely, molecular mechanics trades accuracy for speed, approximating intermolecular interactions with a parameterized classical potential energy function called a force field. Although molecular mechanics methods are applicable to systems with millions of atoms, they are unreliable for systems and physicochemical phenomena that differ from those used to parameterize the force field, which significantly limits the range of problems to which they can be accurately applied.
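To make the notion of a force field concrete, the following is a minimal sketch of one common functional form for the non-bonded part of a classical potential: a Lennard-Jones term plus a Coulomb term, summed over atom pairs from two molecules. It is illustrative only and is not taken from this project. The function names, the tuple layout for atoms, the Lorentz-Berthelot combining rules, and the approximate value of the Coulomb constant are all assumptions made for the example.

```python
import math

# Assumed Coulomb constant in kcal*Angstrom/(mol*e^2). The exact value
# differs slightly between force fields; this one is approximate.
COULOMB_CONSTANT = 332.06


def pair_energy(r, epsilon, sigma, q_i, q_j):
    """Lennard-Jones plus Coulomb energy (kcal/mol) of two atoms r Angstrom apart."""
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 * sr6 - sr6)
    coulomb = COULOMB_CONSTANT * q_i * q_j / r
    return lennard_jones + coulomb


def interaction_energy(molecule_a, molecule_b):
    """Sum pair energies over all atom pairs drawn from two molecules.

    Each atom is a hypothetical tuple: (position, charge, epsilon, sigma).
    """
    total = 0.0
    for pos_a, q_a, eps_a, sig_a in molecule_a:
        for pos_b, q_b, eps_b, sig_b in molecule_b:
            r = math.dist(pos_a, pos_b)
            # Lorentz-Berthelot combining rules (an assumption of this sketch).
            epsilon = math.sqrt(eps_a * eps_b)
            sigma = 0.5 * (sig_a + sig_b)
            total += pair_energy(r, epsilon, sigma, q_a, q_b)
    return total
```

The parameters epsilon, sigma, and the partial charges are fitted to reference data, which is why such a potential is fast to evaluate but can only be trusted for systems that resemble those used in the fit.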
These limitations motivated me to develop state-of-the-art machine-learning (ML) models to accurately and rapidly predict molecular interaction energies and forces. Just as human chemists learn from past experience to make predictions about the properties of new molecules, in ML a mathematical model is trained on prior experimental and/or computational results to predict the properties of new molecules. The project is motivated both by the impressive success of recent statistical ML models in accurately predicting molecular energies and forces and by their inherent shortcomings in modeling intermolecular (long-range and non-local) interactions, which cause them to fail when scaled to larger systems. By modeling long-range interactions using quantum mechanics methods (where they are computationally affordable), I developed an ML method that can be trained using only small-to-medium-size molecules, but applied to larger molecules.
The main objectives achieved within the first half of the proposed action include: 1) developing a new methodology called the Distance-Adapted version of SchNet (DASNet), which improves on existing state-of-the-art neural network architectures, 2) implementing a prototype of DASNet and designing the framework of the final software package for predicting interaction energies based on the proposed model, 3) building the infrastructure for running quantum chemistry computations and compiling training data, 4) releasing the ChemTools software (a free and open-source package comprising a collection of interpretive tools for analyzing the outputs of quantum chemistry calculations to gain chemical knowledge), 5) disseminating the outcomes of the action by presenting at international workshops and conferences and by organizing a hands-on workshop in Europe to promote the Python programming language and teach the ChemTools software package to a wide range of researchers.
While the practical impact of DASNet has not yet been realized, an immediate impact was achieved through the release of the free and open-source ChemTools software package, which I was developing mainly to provide training data for DASNet. ChemTools has nevertheless had a broad impact in its own right: less than a month after its initial release, it had been installed by more than 30 researchers around the world, working in diverse fields including chemistry, biochemistry, and materials science and engineering.
I developed a refined version of the Schütt Network (SchNet), called DASNet, that incorporates chemical knowledge about atomic properties and the key length-scales of intramolecular interactions. This refines and improves the methodology I had anticipated using in my original proposal. The free and open-source software package that I am developing to implement DASNet will be disseminated in the same way as ChemTools, through hands-on sessions at international conferences and dedicated week-long workshops.
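For readers unfamiliar with how SchNet-style networks encode distance information, the following is a minimal sketch of a generic distance featurization: an interatomic distance is expanded on a grid of Gaussian radial basis functions and damped by a smooth cutoff. It illustrates the general technique only and does not reproduce DASNet, whose details are not given in this summary. The function names, the cosine form of the cutoff, and all numerical values are assumptions chosen for the example.

```python
import math


def gaussian_expansion(distance, centers, gamma):
    """Expand a scalar distance on Gaussian radial basis functions.

    Each center sets a length scale at which one feature responds most strongly.
    """
    return [math.exp(-gamma * (distance - mu) ** 2) for mu in centers]


def cosine_cutoff(distance, r_cut):
    """Smoothly damp contributions to zero at the cutoff radius r_cut."""
    if distance < r_cut:
        return 0.5 * (math.cos(math.pi * distance / r_cut) + 1.0)
    return 0.0


def distance_features(distance, centers, gamma, r_cut):
    """Cutoff-weighted radial features for one atom pair."""
    weight = cosine_cutoff(distance, r_cut)
    return [weight * g for g in gaussian_expansion(distance, centers, gamma)]


# Hypothetical grid of centers from 0 to 5 Angstrom in steps of 0.5.
centers = [0.5 * k for k in range(11)]
features = distance_features(1.2, centers, gamma=10.0, r_cut=5.0)
```

In this generic form, the placement of the centers and the cutoff radius determine which length scales the network can resolve. A hard cutoff of this kind is also one reason purely local models struggle with long-range interactions.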