## Periodic Reporting for period 1 - MaLeR (Machine Learning applied to Reactivity: combination of HDNNs with ReaxFF)

Reporting period: 2018-12-01 to 2020-11-30

Machine learning (ML) methods have become increasingly important in business and science for making predictions based on the substantial amount of data that many organizations collect or generate. ML methods are extremely powerful, since computers are much better suited than humans to identify correlations. In computational chemistry and materials science, a typical scientific goal is to predict properties of different molecules and compounds. Computer simulations generate new, physically meaningful data, based on some predefined rule set. In lieu of an actual experiment, which can be costly and dangerous for both people and the environment, simulations are used to answer the scientist’s questions.

The main scientific goal of this project was to explore and apply ML techniques to predict chemical reactivity of molecules and compounds. The rate of chemical reactions can be obtained through computer simulations, if the potential energy surface (the PES) is known. For each possible placement of some collection of atoms in 3-dimensional space, a single number, namely the potential energy, determines how likely that placement of atoms is at any given thermodynamic condition (temperature and pressure).
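The link between potential energy and likelihood can be illustrated with Boltzmann weights. The sketch below is a minimal illustration, assuming a fixed temperature and a small discrete set of configurations; the three energies are hypothetical values, and at constant pressure an additional pressure-volume term would enter the exponent.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def boltzmann_probabilities(energies_ev, temperature_k):
    """Relative probabilities of configurations at a fixed temperature."""
    beta = 1.0 / (K_B_EV_PER_K * temperature_k)
    e_min = min(energies_ev)  # shift by the minimum for numerical stability
    weights = [math.exp(-beta * (e - e_min)) for e in energies_ev]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical potential energies (eV) of three atomic arrangements.
energies = [0.00, 0.05, 0.20]
probs_300 = boltzmann_probabilities(energies, 300.0)
probs_1000 = boltzmann_probabilities(energies, 1000.0)
```

Lower-energy arrangements are always the more probable ones, and raising the temperature increases the share of the high-energy arrangements.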

Predicting the potential energy surface for different molecules can be done through quantum-chemical calculations, for example based on density functional theory (DFT) or coupled cluster (CC). However, such calculations are computationally demanding, and they are not feasible for very large (i.e. realistic) systems. To this end, the project aimed to provide a much faster way of evaluating the potential energy during computer simulations, by parameterizing force fields and ML methods that predict the potential energy.

The primary ML method in this project has been high-dimensional neural network potentials (HDNNPs). At the start of the project, several open-source packages already implemented this method. One package in particular, TorchANI, stood out because its distributors include pre-made, generally applicable force fields with it. Within this project, the fellow integrated TorchANI into the Amsterdam Modeling Suite (AMS), allowing users to get DFT- or CC-quality results for organic molecules almost instantly. Furthermore, the integration of TorchANI was accomplished via a general and flexible interface that also supports other ML Python packages for predicting potential energy surfaces (such as SchNetPack, sGDML, and PiNN), allowing many different types of ML methods to be used with AMS.
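The basic structure of an HDNNP can be sketched in a few lines: each atom gets a descriptor of its local environment, an element-specific network maps that descriptor to an atomic energy, and the total energy is the sum over atoms. The code below is a toy illustration under stated assumptions, not the TorchANI or AMS implementation: the weights in `NETWORKS` are made up, the cutoff and descriptor parameters are arbitrary, and the descriptor uses only radial terms that ignore neighbour element types, whereas production models also use element-resolved and angular terms.

```python
import math

CUTOFF = 5.0  # Angstrom; illustrative cutoff radius


def cutoff_fn(r):
    """Cosine cutoff that decays smoothly to zero at CUTOFF."""
    return 0.5 * (math.cos(math.pi * r / CUTOFF) + 1.0) if r < CUTOFF else 0.0


def radial_descriptor(i, positions, etas=(0.5, 1.0, 2.0), r_shift=1.0):
    """Radial symmetry functions describing the environment of atom i."""
    feats = []
    for eta in etas:
        g = 0.0
        for j, pos_j in enumerate(positions):
            if j != i:
                r = math.dist(positions[i], pos_j)
                g += math.exp(-eta * (r - r_shift) ** 2) * cutoff_fn(r)
        feats.append(g)
    return feats


# Made-up weights: one tiny network (3 inputs, 2 hidden units) per element.
NETWORKS = {
    "H": {"w1": [[0.3, -0.2, 0.1], [0.05, 0.4, -0.3]],
          "b1": [0.0, 0.1], "w2": [-0.7, 0.2], "b2": -0.5},
    "O": {"w1": [[-0.1, 0.25, 0.6], [0.2, -0.35, 0.15]],
          "b1": [0.05, -0.1], "w2": [0.9, -0.4], "b2": -2.0},
}


def atomic_energy(element, feats):
    """Evaluate the element-specific network on one atom's descriptor."""
    net = NETWORKS[element]
    hidden = [math.tanh(sum(w * f for w, f in zip(row, feats)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    return sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]


def total_energy(elements, positions):
    """Total energy as a sum of atomic contributions."""
    return sum(atomic_energy(el, radial_descriptor(i, positions))
               for i, el in enumerate(elements))


# Water-like geometry in Angstrom (illustrative coordinates).
elements = ["O", "H", "H"]
positions = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
energy = total_energy(elements, positions)
```

Because the descriptor depends only on interatomic distances and the atomic energies are summed, the result does not change when the molecule is translated or when the two hydrogen atoms are swapped.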

In a collaboration with researchers from Uppsala University, the fellow developed PiNN, an open-source Python package for constructing and evaluating atomic neural networks for molecules and materials. This package implements not only the high-dimensional neural network potentials, but also message-passing graph-convolutional neural networks, in particular the PiNet architecture which was developed in this project. One benefit of this approach, in comparison to HDNNPs, is that the features of the local atomic environments do not need to be constructed explicitly beforehand, but these features are instead learned by the ML method. The PiNet architecture can be parameterized to accurately predict the potential energy surface and its gradients (forces, and stress tensors for periodic systems). Moreover, it can be applied to directly predict properties, for example the formation energies of materials.

As a result of this project, it is now possible to run simulations inside the Amsterdam Modeling Suite that combine a variety of low-level methods such as ReaxFF, DFTB, or DFT with Machine Learning to improve the original predictions. The interface is very general and also allows quantum-mechanics/molecular-mechanics (QM/MM) hybrid calculations to be performed. It is also possible to parameterize the corresponding ML methods.
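One common way to combine a low-level method with ML is to train the ML model on the difference between a reference energy and the low-level energy, and add that learned correction at prediction time (often called Δ-learning). Whether AMS combines the methods in exactly this way is an assumption here. In the sketch below, a Morse potential stands in for the expensive reference, a harmonic potential for the cheap method, and a straight-line fit for the ML model; all numbers are illustrative.

```python
import math


def e_reference(r):
    """Stand-in for an expensive reference method: a Morse potential."""
    return 4.5 * (1.0 - math.exp(-1.9 * (r - 0.97))) ** 2


def e_low_level(r):
    """Stand-in for a cheap low-level method: a harmonic approximation."""
    return 0.5 * 25.0 * (r - 0.97) ** 2


def fit_linear(xs, ys):
    """Ordinary least squares for y ~ a * x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Training data: the correction is the reference minus the low-level energy.
train_r = [0.80 + 0.05 * k for k in range(11)]
deltas = [e_reference(r) - e_low_level(r) for r in train_r]
a, b = fit_linear(train_r, deltas)


def e_corrected(r):
    """Low-level energy plus the learned correction."""
    return e_low_level(r) + (a * r + b)


def rmse(model):
    """Root-mean-square error against the reference on the training points."""
    return math.sqrt(sum((model(r) - e_reference(r)) ** 2 for r in train_r)
                     / len(train_r))


error_low = rmse(e_low_level)
error_corrected = rmse(e_corrected)
```

The correction is usually smoother and smaller than the total energy, which is why learning it can require less training data than learning the full potential energy surface directly.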

Although the focus of this project was on Machine Learning, the fellow has also deepened his knowledge of other parameterized methods, in particular ReaxFF. ReaxFF is a reactive force field with many parameters that need to be fitted. The fellow co-developed ParAMS, a Python package for fitting parameters for any of the many methods implemented in AMS, including ReaxFF. ParAMS handles training set and validation set evaluations, features a variety of fitting algorithms, and has a special focus on transparency and reproducibility.
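Parameter fitting of this kind amounts to minimizing a loss that measures how far the model's predictions are from reference data. The sketch below shows the idea on a two-parameter Lennard-Jones potential with a synthetic training set and a plain grid search. It does not use the ParAMS API, and ReaxFF has far more parameters than a grid search could handle; the function names and values are hypothetical.

```python
def lennard_jones(r, epsilon, sigma):
    """Lennard-Jones pair energy."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Synthetic training set generated with known parameters (illustrative).
TRUE_EPS, TRUE_SIGMA = 0.25, 3.4
train_r = [3.2 + 0.2 * k for k in range(10)]
reference = [lennard_jones(r, TRUE_EPS, TRUE_SIGMA) for r in train_r]


def loss(epsilon, sigma):
    """Sum of squared errors between model and reference energies."""
    return sum((lennard_jones(r, epsilon, sigma) - e_ref) ** 2
               for r, e_ref in zip(train_r, reference))


def grid_search():
    """Evaluate the loss on a parameter grid and keep the best point."""
    best = None
    for i in range(1, 11):        # epsilon from 0.05 to 0.50
        for j in range(0, 11):    # sigma from 3.0 to 4.0
            eps, sig = 0.05 * i, 3.0 + 0.1 * j
            value = loss(eps, sig)
            if best is None or value < best[0]:
                best = (value, eps, sig)
    return best


best_loss, best_eps, best_sigma = grid_search()
```

In practice the training set would come from reference calculations, a separate validation set would be used to detect overfitting, and a dedicated optimizer would replace the grid search.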

These new developments have been released as part of SCM’s Amsterdam Modeling Suite 2020, and are thus available to the materials modelling community. Likewise, the PiNN Python library is available on GitHub.

Finally, the fellow was involved in two more applied collaborations with experimental researchers. In the first, ML simulations were used to elucidate temperature effects on the ionic conductivity in concentrated alkaline electrolytes, in particular how deviations from the Nernst-Einstein theory of conductivity are amplified by proton transfer reactions. In the second collaboration, the fellow’s ML simulations and DFT calculations were used to elucidate the formation mechanism of a particular zeolite in very alkaline silica solutions. By combining these simulations with state-of-the-art experimental techniques, the collaborative effort could for the first time illustrate how an inorganic ion-pair could be a structure-directing agent during zeolite synthesis.
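For context, the Nernst-Einstein relation estimates the conductivity from the diffusion coefficients of the individual ions, assuming that the ions move independently of each other: σ = F²/(RT) · Σ cᵢ zᵢ² Dᵢ. A deviation is then the gap between this ideal estimate and the conductivity actually measured or simulated. The sketch below evaluates the relation; the concentrations and diffusion coefficients are illustrative placeholders, not data from the project.

```python
FARADAY = 96485.33212        # Faraday constant in C/mol
GAS_CONSTANT = 8.314462618   # gas constant in J/(mol K)


def nernst_einstein_conductivity(ions, temperature_k):
    """Ideal conductivity in S/m.

    ions: iterable of (concentration in mol/m^3, charge number,
    diffusion coefficient in m^2/s).
    """
    prefactor = FARADAY ** 2 / (GAS_CONSTANT * temperature_k)
    return prefactor * sum(c * z ** 2 * d for c, z, d in ions)


# Placeholder values for a cation and an anion (not project data).
ions = [(1000.0, +1, 2.0e-9), (1000.0, -1, 5.0e-9)]
sigma_ne = nernst_einstein_conductivity(ions, 298.15)
```

Since the relation is linear in concentration at fixed diffusion coefficients, any correlated ion motion or proton transfer shows up as a departure from this estimate.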

This project has delivered a much faster way of evaluating the potential energy of a molecular system thanks to the combination and parameterization of force fields and Machine Learning methods. The integration of the TorchANI package into the Amsterdam Modeling Suite allows modellers to get DFT or CC-quality results for organic molecules almost instantly. Furthermore, that integration was achieved in such a way that other ML Python packages for predicting potential energy surfaces can also be used with AMS.

Thanks to this project, Machine Learning approaches can be used to improve on the original predictions of a variety of low-level methods in the Amsterdam Modeling Suite, including quantum-mechanics/molecular-mechanics (QM/MM) hybrid calculations.

Another result from this project (co-developed by the fellow) is the ParAMS Python package for fitting parameters for methods such as ReaxFF. This is an important result, as ReaxFF is a powerful, in-demand method but its parameterization is a notoriously complex task.

These innovations have been released as part of SCM’s Amsterdam Modeling Suite (2020 release). The company has a proven track record of bridging the academic and industrial sectors, and will ensure that the technology developed during the project reaches the materials modelling community, facilitating faster and more accurate modelling of the chemical reactivity of molecules and compounds.
