Periodic Reporting for period 2 - AutoCheMo (Automatic generation of Chemical Models)
Reporting period: 2020-09-01 to 2022-08-31
The automatic derivation of complex chemical models from molecular simulations has the potential to become a very cost-effective tool in the design of industrial chemical reactors. This has driven enormous progress in the field over the past decades, but the exploration of complex reaction networks still requires an extensive (and sometimes infeasible) amount of manual labor. AutoCheMo set out to remove these manual bottlenecks by extending established methodologies (the ChemTraYzer and ReaxFF codes for modelling reactivity, and transition state theory) and by addressing their main limitations (respectively: scaling towards extended systems, overall reliability, and the quantum-mechanical (QM) description of anharmonic modes) with new theoretical models and corresponding implementations in user-friendly simulation software.
The project comprised four focused, research-oriented work packages, with one fellow taking the lead on each topic and ample opportunity to collaborate and exchange results. Each topic touches upon the expertise of all partners, yet each partner brings its own commercial or academic perspective to the project. Research actions were organized in conjunction with local and network-wide training, including academic and industrial specialist courses, transferable skills training, (international) workshops and training-through-research. Furthermore, the novel methods developed during the action have been released to the modelling community as part of SCM's Amsterdam Modeling Suite (AMS).
The PhD project of Felix Schmalz centered on developing and implementing tools that automatically analyze simulations of large molecules and complex mechanisms to determine relevant reaction classes, reduced lumped mechanisms and reaction network graphs that give the user physical insight. He began by developing a new version of ChemTraYzer, a software package developed by the Leonhard group for processing and analyzing chemical trajectories. The code was refactored, and an alternative description of molecular substructures was introduced and described in a paper: so-called subgraph fingerprint descriptors. These descriptors allow molecules and active complexes to be described partially, which in turn makes it possible to classify and group reactions by a similarity measure. The ChemTraYzer code featuring the subgraph description was integrated into the AMS software package of the industrial partner SCM, along with documentation and a tutorial to facilitate its use by other researchers.
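To make the idea of grouping reactions by a similarity measure concrete, the following minimal Python sketch groups reactions whose fingerprints overlap strongly. Everything in it is an assumption made for illustration: the fingerprints are plain sets of made-up substructure labels, the similarity is the Tanimoto (Jaccard) coefficient, and the grouping is a simple greedy pass. It does not reproduce the subgraph fingerprint descriptors or the similarity measure used in ChemTraYzer.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two fingerprints given as sets."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def group_by_similarity(fingerprints, threshold):
    """Greedy grouping: each reaction joins the first group whose
    representative (its first member) is at least `threshold` similar;
    otherwise it starts a new group."""
    groups = []
    for name, fp in fingerprints.items():
        for group in groups:
            representative = fingerprints[group[0]]
            if tanimoto(fp, representative) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical fingerprints: the labels below are invented placeholders,
# not real ChemTraYzer descriptors.
example_fingerprints = {
    "r1": {"C-H", "O-H", "C.O"},
    "r2": {"C-H", "O-H", "C.O", "C-C"},
    "r3": {"N-N", "N-O"},
}

if __name__ == "__main__":
    print(group_by_similarity(example_fingerprints, threshold=0.5))
```

A greedy pass keeps the sketch short; a real classification would more likely use a proper clustering method and a chemically motivated threshold.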
The project of PhD student Leonid Komissarov was devoted to the development and implementation of tools that automatically create more accurate force fields. ReaxFF is a reactive force field for large-scale molecular dynamics simulations with chemical reactions. It is a powerful method, but it relies on parameters that are specific to the system being modelled, and determining them is usually a demanding task. Mr. Komissarov worked on a tool for the automated reparametrization of ReaxFF force fields that uses high-level QM data for training. This novel tool, named ParAMS, has been validated in realistic and challenging applications described in two peer-reviewed publications, including a reparametrization of the popular GFN-xTB1 model. ParAMS has also been released alongside SCM's AMS suite.
PhD fellow Michael Gustavo worked on new methods to reparametrize ReaxFF efficiently. The initial plan was to use Bayesian statistics to detect when a reactive molecular dynamics simulation enters a region of chemical space for which the employed ReaxFF parameters need to be refined. This is quite challenging, as the ReaxFF error function falls into perhaps the toughest class of optimisation problems owing to its stochasticity, high dimensionality and expense of evaluation, among other factors. Moreover, early work indicated that error estimates from conventional Bayesian frameworks were not reliable, so this route was not pursued further. Instead, the work addressed the closely related topic of sensitivity analysis, an important factor in the difficult task of selecting the relevant ReaxFF parameters for optimisation. The result has been the integration of Hilbert-Schmidt Independence Criterion (HSIC) sensitivity analysis into ParAMS and the new GloMPO code. This new methodology has been described in a paper and validated with an application to the parametrization of a zinc sulfide ReaxFF force field, and GloMPO is currently being included in the AMS suite for public release.
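As an illustration of the idea behind HSIC-based sensitivity analysis, the following minimal Python sketch ranks two parameters of a toy loss function by a biased empirical HSIC estimate with Gaussian kernels. The toy loss, the kernel bandwidths, the sample size and all function names are assumptions made for this example; it is not the ParAMS or GloMPO implementation, and it has nothing to do with a real ReaxFF error function.

```python
import math
import random


def gaussian_gram(values, bandwidth):
    """Gram matrix of a Gaussian kernel evaluated on scalar samples."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in values]
            for a in values]


def center(gram):
    """Double-center a symmetric Gram matrix (H K H with H = I - 11^T / n)."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic(x, y, bandwidth_x, bandwidth_y):
    """Biased empirical HSIC estimate, here normalised as trace(K H L H) / n^2."""
    n = len(x)
    centered_x = center(gaussian_gram(x, bandwidth_x))
    gram_y = gaussian_gram(y, bandwidth_y)
    trace = sum(centered_x[i][j] * gram_y[j][i]
                for i in range(n) for j in range(n))
    return trace / n ** 2


def toy_sensitivities(n=150, seed=0):
    """Sample two hypothetical parameters and score each against a toy loss.

    The loss depends non-linearly on p0 only; p1 is irrelevant by construction.
    """
    rng = random.Random(seed)
    p0 = [rng.uniform(0.0, 2.0) for _ in range(n)]
    p1 = [rng.uniform(0.0, 2.0) for _ in range(n)]
    loss = [(a - 1.0) ** 2 + rng.gauss(0.0, 0.05) for a in p0]
    return {
        "p0": hsic(p0, loss, bandwidth_x=0.5, bandwidth_y=0.3),
        "p1": hsic(p1, loss, bandwidth_x=0.5, bandwidth_y=0.3),
    }


if __name__ == "__main__":
    for name, score in toy_sensitivities().items():
        print(name, score)
```

The toy loss is symmetric around p0 = 1, so its linear correlation with p0 is close to zero even though the dependence is strong. A kernel-based measure such as HSIC is designed to pick up this kind of non-linear dependence, which is one reason it is attractive for screening parameters before an expensive optimisation.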
Finally, the PhD project of Gabriel Rath aimed at the development of tools that automatically identify (intra)molecular degrees of freedom that require a QM treatment of anharmonic motion and that efficiently determine the potential energy hypersurface (PES) from as few QM calculations as possible. Based on this PES, accurate thermochemical data and reaction rates can then be computed. Early work showed that, even for relatively small systems, the naïve Monte Carlo integration method considered in the original plan was too inefficient. To work around this, Mr. Rath settled on implementing and using MISER, a Monte Carlo integration method that dramatically increases efficiency by recursively applying a stratified sampling strategy. In practice, however, MISER proved to be significantly less efficient than it could be in theory, so two improved versions were developed that increase efficiency and reduce uncertainty. These new approaches have been described in a paper and implemented in CIMCI, a new, black-boxable software tool for handling molecular anharmonicity with Monte Carlo integration. This tool is currently being packaged as part of the next AMS release.
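The recursive stratified sampling idea mentioned above can be sketched in a few lines of Python. This is a deliberately simplified, one-dimensional illustration and not the MISER algorithm itself, nor the improved variants implemented in CIMCI: the region is bisected, a small share of the points is spent estimating the spread of the integrand in each half, and the remaining points are allocated in proportion to those spreads. The quartic potential, the reduced units and all names and default values are assumptions made for this example.

```python
import math
import random


def plain_mc(f, lo, hi, n, rng):
    """Plain Monte Carlo estimate of the integral of f over [lo, hi]."""
    total = sum(f(rng.uniform(lo, hi)) for _ in range(n))
    return (hi - lo) * total / n


def _sample_std(f, lo, hi, n, rng):
    """Sample standard deviation of f from n uniform points in [lo, hi]."""
    values = [f(rng.uniform(lo, hi)) for _ in range(n)]
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def stratified_mc(f, lo, hi, n, rng, min_points=64, explore_fraction=0.1):
    """Recursive bisection with spread-proportional point allocation."""
    if n < min_points:
        return plain_mc(f, lo, hi, n, rng)
    mid = 0.5 * (lo + hi)
    # Exploration points are used only to estimate the spread in each half.
    n_explore = max(4, int(explore_fraction * n) // 2)
    sd_left = _sample_std(f, lo, mid, n_explore, rng)
    sd_right = _sample_std(f, mid, hi, n_explore, rng)
    n_rest = n - 2 * n_explore
    if sd_left + sd_right == 0.0:
        share = 0.5
    else:
        share = sd_left / (sd_left + sd_right)
    # Keep at least a few points in each half so neither is ignored.
    n_left = min(max(int(share * n_rest), 8), n_rest - 8)
    n_right = n_rest - n_left
    return (stratified_mc(f, lo, mid, n_left, rng, min_points, explore_fraction)
            + stratified_mc(f, mid, hi, n_right, rng, min_points, explore_fraction))


def boltzmann_factor(x):
    """exp(-V(x)) for a hypothetical quartic potential V(x) = 50 x^4 (reduced units)."""
    return math.exp(-50.0 * x ** 4)


if __name__ == "__main__":
    rng = random.Random(1)
    # Analytic value over the whole real line; the tails beyond [-1, 1] are negligible.
    reference = 2.0 * math.gamma(1.25) * 50.0 ** -0.25
    print("plain     :", plain_mc(boltzmann_factor, -1.0, 1.0, 4000, rng))
    print("stratified:", stratified_mc(boltzmann_factor, -1.0, 1.0, 4000, rng))
    print("reference :", reference)
```

Because the exploration points are drawn independently of the points used for the estimate, the allocation does not bias the result; it only shifts effort towards the regions where the integrand varies most.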