The ML-MULTIMEM project focused on advancing the application of machine learning (ML) methods in coarse-grained (CG) molecular simulations. The project addressed systems of progressively increasing chemical complexity and investigated the challenges emerging in this nascent field. We conducted an extensive investigation of the effect of model hyperparameters and loss function definition on the development of ML CG force fields for five systems: liquid benzene mapped with 1 CG bead per molecule, liquid benzene mapped with 3 CG beads per molecule, polyethylene mapped with 1 CG bead per monomer, polyethylene mapped with 2 CG beads per monomer, and PIM-1 mapped with 3 CG beads per molecule. Despite significant challenges associated with hyperparameter optimization, a combination of settings that yielded satisfactory results was identified for each system. Temperature and size transferability tests were also performed for the acceptable models, and consistent behaviour was observed.
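To illustrate what mapping a molecule onto CG beads means in practice, the following minimal sketch places each bead at the centre of mass of the atoms assigned to it. The centre-of-mass choice, the function name `map_to_beads`, and the `bead_index` argument are assumptions made for illustration; the summary above does not specify which mapping operator the project used.

```python
def map_to_beads(positions, masses, bead_index):
    """Return one position per CG bead, taken as the centre of mass
    of the atoms assigned to that bead (illustrative assumption).

    positions  : list of [x, y, z] atomic coordinates
    masses     : list of atomic masses, same length as positions
    bead_index : list giving, for each atom, the index of its bead
    """
    n_beads = max(bead_index) + 1
    total_mass = [0.0] * n_beads
    weighted_sum = [[0.0, 0.0, 0.0] for _ in range(n_beads)]

    # Accumulate mass and mass-weighted coordinates per bead.
    for pos, mass, bead in zip(positions, masses, bead_index):
        total_mass[bead] += mass
        for k in range(3):
            weighted_sum[bead][k] += mass * pos[k]

    # Divide by the bead mass to obtain the centre of mass.
    return [
        [component / total_mass[bead] for component in weighted_sum[bead]]
        for bead in range(n_beads)
    ]
```

With `bead_index` set to all zeros, a whole molecule collapses onto a single bead, as in the 1-bead benzene case; assigning atoms to three different indices gives a 3-bead representation.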
Identifying a suitable model was found to require the evaluation of multiple criteria, relating not only to training metrics but also to simulations performed with the trained models. The definition of the loss function components and their relative weights during training is particularly important. To this end, a dedicated study of self-adaptive methods for determining the loss function coefficients was conducted, and a statistically grounded scoring procedure was proposed to evaluate the different methods and identify the best one.
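The role of the loss function coefficients can be sketched with a toy example. The code below combines named loss components into a weighted sum and shows one simple self-adaptive rule, in which each coefficient is set inversely proportional to the current magnitude of its component so that all components contribute equally. This rule, the function names, and the component names `force` and `energy` are hypothetical; they are not taken from the project, whose actual methods and scoring procedure are not described in this summary.

```python
def weighted_loss(terms, weights):
    """Total loss as a weighted sum of named components."""
    return sum(weights[name] * value for name, value in terms.items())


def rebalance(terms, eps=1e-12):
    """Illustrative self-adaptive rule (an assumption, not the
    project's method): weights inversely proportional to the current
    magnitude of each component, normalised to sum to 1."""
    inverse = {name: 1.0 / (abs(value) + eps) for name, value in terms.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


# Hypothetical component values at some training step.
terms = {"force": 4.0, "energy": 1.0}
weights = rebalance(terms)
loss = weighted_loss(terms, weights)
```

After rebalancing, the larger component receives the smaller coefficient, so neither term dominates the gradient purely because of its scale. Other self-adaptive schemes make different trade-offs, which is why a systematic comparison of methods is needed.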
Open-source codes (SchNetPack) were extended to enable the study of macromolecular systems by incorporating connectivity, particle typing, and molecule membership information, which makes it possible to distinguish between inter- and intramolecular particle neighbours. Moreover, the developed models were interfaced with popular open-source molecular dynamics codes (LAMMPS). These contributions to open-source projects support broad dissemination and exploitation of the project results.
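The idea of using molecule membership to separate neighbour types can be sketched as follows. This is a plain-Python illustration, not SchNetPack or LAMMPS code: the function name `split_neighbours` and its arguments are hypothetical, and the neighbour pairs are assumed to come from a separate neighbour-list step.

```python
def split_neighbours(pairs, molecule_id):
    """Split neighbour pairs into intramolecular and intermolecular.

    pairs       : list of (i, j) particle index pairs within the cutoff
    molecule_id : list giving, for each particle, the molecule it belongs to
    """
    intra, inter = [], []
    for i, j in pairs:
        # Same molecule index means the pair is intramolecular.
        if molecule_id[i] == molecule_id[j]:
            intra.append((i, j))
        else:
            inter.append((i, j))
    return intra, inter


# Two 3-bead molecules: particles 0-2 and particles 3-5.
molecule_id = [0, 0, 0, 1, 1, 1]
pairs = [(0, 1), (2, 3), (4, 5), (0, 5)]
intra, inter = split_neighbours(pairs, molecule_id)
```

Keeping the two sets apart lets a model treat bonded and non-bonded environments differently, which matters for macromolecules where a nearby bead may belong to the same chain or to a different one.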
The project results were widely disseminated to the research community through the publication of two conference papers and participation in a range of scientific congresses with a focus on artificial intelligence, materials science, and engineering, promoting interdisciplinary knowledge transfer. The work was also presented in several seminars and invited talks for national and international audiences. Moreover, the project team contributed to the organization of the scientific workshop “AI in Natural Sciences and Technology (AINST)”. Outreach actions allowed interaction with the general public, especially pupils and high-school students.