Work on MachineCat included the development of an open-source code package (SchNetPack) for machine learning in chemical systems together with other researchers of the host group. This code package was then used in a comparison of different ML approaches. Subsequently, an investigation of the interpretability of ML models in chemistry was conducted, based on two prominent approaches and existing data sets. The insights gained in this manner were then applied to modeling an organic reaction. This resulted in an entirely new ML model called FieldSchNet, capable of describing the interactions of molecules with external environments and fields. Due to its physics-inspired structure, FieldSchNet can operate in a plethora of different ways, which were explored in a study of solvent effects on molecular spectroscopy and reactions, as well as in the design of molecular environments to enhance chemical reactions. At the same time, the SchNOrb model for predicting molecular wavefunctions was developed in collaboration with international researchers and members of the host group, bringing ML models even closer to high-level computational methods. Research on these models also sparked the implementation of a new architecture for the simulation of photochemical phenomena (SchNarc) together with researchers from the University of Vienna. In addition, a generative model for molecules (g-SchNet), the first ML model capable of automatically generating 3D structures of molecules, was developed together with ML experts of the host group. Following these method developments, the carbon dioxide conversion reaction was studied. The nature of the system necessitated further adaptations of the ML models in SchNetPack, as well as a significant extension of the simulation capabilities of the package. To make reference computations with accurate computational chemistry methods possible, a new fragmentation approach was implemented.
Based on these extensions, a potential for the carbon dioxide conversion reaction, capable of modeling the complex reaction dynamics with unprecedented accuracy, is currently being finalized.
The main results achieved by MachineCat encompass the development of the SchNetPack code package, which is suitable not only for constructing models but also for simulations and as a development tool for researchers. The studies of organic reactions and carbon dioxide conversion in particular have shown that ML approaches are able to exceed the limits of conventional methods and yield predictions close to experiment. Over the course of MachineCat, four fundamentally new ML models were developed in the form of FieldSchNet, SchNOrb, g-SchNet and SchNarc, each opening new avenues for research. FieldSchNet's ability to model solvent effects will be exploited together with BASF as an industrial partner as part of the BASLEARN project.
The research undertaken in MachineCat has so far resulted in four publications in peer-reviewed journals (three of them in high-impact journals), two book chapters and one publication at NeurIPS, the leading ML conference. Posters were presented at NeurIPS and a joint workshop (BBDC, BZML and RIKEN). The researcher gave talks at three workshops (IPAM, UniSysCat), one at the annual conference of the American Physical Society, three at seminars (host group, BasCat and UniSysCat), one as part of a visiting fellowship at the University of Warwick and one talk at the BASF headquarters in Ludwigshafen.