Periodic Reporting for period 1 - ML-MULTIMEM (Machine Learning-aided Multiscale Modelling Framework for Polymer Membranes)
Reporting period: 2021-11-15 to 2023-11-14
1. We developed an ML-based multiscale simulation strategy, bridging atomic and coarse-grained scales, for the study of macromolecular and organic systems at bulk conditions.
2. We incorporated the developed ML method into open-source packages and widely used simulation tools.
3. We utilized the developed strategy to simulate organic liquids and polymers of industrial interest (polyethylene, PIM-1), to showcase its application to real-world test cases.
This work holds societal significance due to the pervasive use of polymers in manufacturing, healthcare, energy, and environmental technologies. Improving our capacity to design polymers efficiently has the potential to catalyse breakthroughs in diverse sectors. The proposed ML-based approach offers a pathway to potentially increase the efficiency and versatility of molecular modelling more generally, with broad implications for advancements in a multitude of industries and technologies.
Identifying a suitable model required evaluating multiple criteria, covering not only training metrics but also the behaviour of simulations performed with the trained models. The definition of the loss function components and their relative weights during training proved particularly important. For this reason, a dedicated study of self-adaptive methods for determining the loss function coefficients was conducted, and a statistically grounded scoring procedure was proposed to evaluate the different methods and identify the best one.
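To make the idea of self-adaptive loss coefficients concrete, the following is a minimal sketch of one simple rebalancing rule: each component is weighted by the inverse of its running average, so that all components contribute a similar magnitude to the total loss. This rule, the class name `InverseMagnitudeWeights`, the function `total_loss`, and the component names are illustrative assumptions, not the methods studied in the project.

```python
from typing import Dict, Iterable


class InverseMagnitudeWeights:
    """Hypothetical self-adaptive weighting: weight_k ~ 1 / running_mean(loss_k)."""

    def __init__(self, names: Iterable[str], momentum: float = 0.9, eps: float = 1e-12):
        self.momentum = momentum
        self.eps = eps  # guards against division by zero
        self.running = {name: None for name in names}

    def update(self, losses: Dict[str, float]) -> Dict[str, float]:
        """Update running averages and return weights that sum to 1.

        Every component is expected to be present in `losses` at each call.
        """
        for name, value in losses.items():
            prev = self.running[name]
            if prev is None:
                # First observation initialises the running average.
                self.running[name] = value
            else:
                # Exponential moving average of the component magnitude.
                self.running[name] = self.momentum * prev + (1.0 - self.momentum) * value
        inverse = {n: 1.0 / (v + self.eps) for n, v in self.running.items()}
        norm = sum(inverse.values())
        return {n: w / norm for n, w in inverse.items()}


def total_loss(losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the loss components."""
    return sum(weights[n] * losses[n] for n in losses)
```

In a training loop, `update` would be called once per step with the current component values, and the returned weights would be treated as constants when computing gradients. Other self-adaptive schemes exist, and comparing such schemes is the purpose of the scoring procedure mentioned above.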
Open-source codes (SchNetPack) were extended to enable the study of macromolecular systems by incorporating connectivity, particle typing, and molecule membership information, which makes it possible to distinguish between intermolecular and intramolecular particle neighbours. Moreover, the developed models were interfaced with popular open-source molecular dynamics codes (LAMMPS). These contributions to open-source projects enable broad dissemination and exploitation of the project results.
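As an illustration of how molecule membership information can separate the two kinds of neighbours, the sketch below builds a brute-force neighbour list within a cutoff and then splits the pairs by comparing molecule identifiers. The function names `neighbor_pairs` and `split_neighbors` are hypothetical and do not correspond to the SchNetPack or LAMMPS APIs; periodic boundary conditions, which a bulk simulation would need, are omitted for brevity.

```python
import math
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]


def neighbor_pairs(positions: Sequence[Sequence[float]], cutoff: float) -> List[Pair]:
    """Return all particle pairs (i, j), i < j, closer than `cutoff`.

    Brute-force O(N^2) search without periodic images (simplifying assumption).
    """
    pairs = []
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) < cutoff:
                pairs.append((i, j))
    return pairs


def split_neighbors(pairs: Sequence[Pair], molecule_ids: Sequence[int]) -> Tuple[List[Pair], List[Pair]]:
    """Split pairs into intramolecular and intermolecular lists.

    `molecule_ids[k]` is the identifier of the molecule that particle k belongs to.
    """
    intra, inter = [], []
    for i, j in pairs:
        if molecule_ids[i] == molecule_ids[j]:
            intra.append((i, j))  # both particles belong to the same molecule
        else:
            inter.append((i, j))  # particles belong to different molecules
    return intra, inter
```

A model can then treat the two lists differently, for example by using separate interaction terms for bonded and non-bonded environments. Whether and how this is done in the project codes is not specified here.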
The project results were widely disseminated to the research community through the publication of 2 conference papers and participation in diverse scientific congresses focused on artificial intelligence, materials science, and engineering, promoting interdisciplinary knowledge transfer. The work was also presented in several seminars and invited talks for national and international audiences. Moreover, the project team contributed to the organization of the scientific workshop “AI in Natural Sciences and Technology (AINST)”. Outreach actions allowed interaction with the general public, especially pupils and high-school students.
The project has tackled the study of organic and macromolecular systems at bulk conditions, whereas previous literature reports on the development of ML CG force fields mostly focused on isolated macromolecules. This required extending the methods to also account for connectivity, particle typing, and molecule membership information. The codes developed during the project, which implement these extensions, are shared as open-source projects to maximize the impact and exploitation of the results.
The problem of learning with a multicomponent loss function, which constituted a clear need in the ML-MULTIMEM setting, was addressed by investigating self-adaptive methods for determining the loss function coefficients. Moreover, a ranking procedure for comparative analyses of different methods was developed. Even though it was tested here in the context of a natural science application, the problem of multicomponent loss learning is a general one, and therefore the proposed method can be applied in a broad variety of settings.