Periodic Reporting for period 1 - MachineCat (Machine Learning for Catalytic Carbon Dioxide Activation)
Reporting period: 2018-06-01 to 2020-05-31
The main objective of Machine Learning for Catalytic Carbon Dioxide Activation (MachineCat) was to deepen our understanding of ML in computational chemistry and to use this knowledge to push the boundaries of existing approaches. To this end, MachineCat studied chemical problems that prove challenging for current ML methods, focusing on the organocatalytic conversion of carbon dioxide mediated by a modified chitosan catalyst. This reaction is highly relevant for sustainable chemistry, as it offers cheap access to value-added chemicals, potentially replacing fossil fuels as the primary carbon source. Yet little is known about how the reaction proceeds, and one objective of MachineCat was to use ML approaches to elucidate the reaction mechanism. As a final objective, MachineCat aimed to explore the potential of ML methods for the rational design of new compounds and improved catalytic systems.
The findings of MachineCat demonstrate that the utility of modern ML architectures extends beyond providing efficient and accurate models of complex chemical systems. By incorporating physical relations into the structure of these ML models, their predictions and internal states can be readily understood in the context of fundamental chemical concepts, such as atomic charges and orbitals. Moreover, physical laws can be integrated in such a way that the rich formalism of quantum mechanics can be applied directly to these ML models. This offers access to a vast range of chemical properties and even the molecular wavefunction itself. Such models provide a direct relationship between chemical structure, composition and properties, which can be leveraged to design compounds with desirable qualities. Finally, by incorporating ideas from the field of ML, it is even possible to construct models which can directly generate the structure and composition of novel compounds.
The main results achieved by MachineCat include the development of the SchNetPack code package, which is suitable not only for constructing models but also for simulation purposes and as a development tool for researchers. The studies of organic reactions, and of carbon dioxide conversion in particular, have shown that ML approaches are able to exceed the limits of conventional methods and yield predictions close to experiment. Over the course of MachineCat, four fundamentally new ML models were developed in the form of FieldSchNet, SchNOrb, g-SchNet and SchNarc, each opening new avenues for research. FieldSchNet's ability to model solvent effects will be exploited together with BASF as an industrial partner as part of the BASLEARN project.
The research undertaken in MachineCat has so far resulted in four publications in peer-reviewed journals (three of them in high-impact journals), two book chapters and one publication at NeurIPS, the leading ML conference. Posters were presented at NeurIPS and a joint workshop (BBDC, BZML and RIKEN). The researcher gave talks at three workshops (IPAM, UniSysCat), one at the annual conference of the American Physical Society, three at seminars (host group, BasCat and UniSysCat), one as part of a visiting fellowship at the University of Warwick and one at the BASF headquarters in Ludwigshafen.