Periodic Reporting for period 1 - HYDRO-CLUSTER (Micro-Hydration of Atmospheric Molecular Clusters)
Reporting period: 2023-06-01 to 2025-05-31
Secondary aerosol formation begins with the creation of molecular clusters via collisions between atmospheric molecules, including water. These clusters can then grow into stable aerosol particles. However, the exact role of water in the early stages of new particle formation (NPF) remains poorly understood. This is mainly due to experimental limitations in detecting water in small molecular clusters, particularly with widely used instrumentation such as chemical-ionization atmospheric-pressure-interface time-of-flight mass spectrometry (CI–APi–ToF MS), in which water typically evaporates before detection. At the same time, humidity can influence the stability of these clusters and alter particle formation rates by more than two orders of magnitude. Thus, although water is one of the most abundant atmospheric molecules, its role in new particle formation remains largely unresolved.
The primary objective of this MSCA research fellowship is to establish a robust theoretical framework for accurately modelling the role of water in atmospheric new particle formation. This will be achieved by developing advanced machine-learning (ML) methods trained on high-level quantum chemical (QC) data. These ML models will enable efficient, large-scale molecular dynamics simulations to reveal the detailed dynamics and thermodynamics (e.g. evaporation and fragmentation) of molecular clusters containing water. Moreover, we aim to go beyond the state of the art and question whether the statistical thermodynamics applied on top of quantum chemistry calculations adequately describes weakly bound molecular clusters, such as those containing water, since water typically binds weakly compared with other, strongly binding new-particle-formation precursors.
In summary, this research will deliver new theoretical insights and computational tools to tackle one of the central unresolved problems in atmospheric science, with broad implications for climate modelling, public health, and evidence-based policymaking.
Additionally, we mapped the potential energy surfaces (PES) of small hydrated clusters, identifying limitations in traditional structure comparison techniques (e.g. RMSD). We introduced a kernel-based similarity metric (a modified FCHL kernel implemented within QML) that more reliably distinguishes cluster configurations. These studies reveal shallow energy barriers between conformers, indicating rapid interconversion and thus enhanced thermodynamic stability through entropic effects. This highlighted the need for a different technique that captures the anharmonic vibrations and the fast interconversion between the numerous minima. We therefore searched for a dynamics-based technique that captures these effects.
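One known limitation of RMSD-based comparison can be illustrated with a short sketch. The function below is a generic Kabsch-aligned RMSD written for illustration (it is not the project's code, and the coordinates are made up). It shows that RMSD depends on the atom ordering: relabelling two chemically identical atoms gives a non-zero RMSD even though the structure is physically the same.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate arrays after optimal rigid alignment.

    Atoms are matched by index, so the result depends on the atom ordering.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove translation by centring both structures on their centroids.
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    # Optimal rotation from the SVD of the covariance matrix (Kabsch algorithm).
    U, _, Vt = np.linalg.svd(P.T @ Q)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    P_rot = P @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Q) ** 2, axis=1))))

# Hypothetical coordinates of four identical atoms (arbitrary units).
coords = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 2.0, 0.0],
                   [0.0, 0.0, 3.0]])

# Same structure with the labels of atoms 0 and 1 exchanged.
relabelled = coords[[1, 0, 2, 3]]

rmsd_same = kabsch_rmsd(coords, coords)
rmsd_relabelled = kabsch_rmsd(coords, relabelled)
```

Here `rmsd_same` is zero to numerical precision, whereas `rmsd_relabelled` is not. Permutation-invariant descriptors avoid this ambiguity by construction.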
Machine learning (ML) models were initially trained on sulfuric acid–water clusters and later extended to other systems. Although we originally proposed to use a Quantum Machine Learning (QML) model, we ultimately used a neural network (NN) model (specifically PaiNN as implemented in SchNetPack) because it rapidly predicts both energies and forces. These models were used to accelerate molecular dynamics (MD) simulations, particularly umbrella sampling, to accurately capture the thermodynamics of binding and reaction pathways.
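The idea behind umbrella sampling can be sketched as follows. The report does not specify the collective variable, spring constant, or window layout, so the centre-of-mass distance between two fragments, the harmonic restraint, and all numerical values below are assumptions chosen purely for illustration. In each simulation window, a harmonic bias is added to the (ML-predicted) potential to keep the system near a chosen value of the coordinate.

```python
import numpy as np

def com_distance(positions, masses, idx_a, idx_b):
    """Distance between the centres of mass of two fragments (assumed coordinate)."""
    positions = np.asarray(positions, dtype=float)
    masses = np.asarray(masses, dtype=float)
    com_a = np.average(positions[idx_a], axis=0, weights=masses[idx_a])
    com_b = np.average(positions[idx_b], axis=0, weights=masses[idx_b])
    return float(np.linalg.norm(com_a - com_b))

def harmonic_bias(distance, center, k):
    """Bias energy 0.5*k*(d - d0)^2 and the corresponding force along the coordinate."""
    energy = 0.5 * k * (distance - center) ** 2
    force = -k * (distance - center)
    return energy, force

def window_centers(start, stop, n):
    """Evenly spaced restraint centres, one per umbrella window."""
    return np.linspace(start, stop, n)

# Hypothetical three-atom example in arbitrary, consistent units.
positions = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
masses = [1.0, 1.0, 2.0]
d = com_distance(positions, masses, idx_a=[0, 1], idx_b=[2])
bias_energy, bias_force = harmonic_bias(d, center=3.0, k=10.0)
centers = window_centers(2.0, 6.0, 5)
```

The biased histograms collected in the individual windows are then combined into a free energy profile with a reweighting scheme such as WHAM or MBAR, which is not shown here.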
Finally, we identified systematic errors in the statistical thermodynamics corrections commonly applied to QC data and demonstrated improved approaches for evaluating free energy differences in hydrated clusters. We still intend to use these findings to improve the above-mentioned NPF models.
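For context, the sketch below assumes that the corrections in question are of the widely used harmonic-oscillator type; the report does not state which formulas were examined, so this is an illustration of one well-known weakness rather than a reproduction of the analysis. In the harmonic approximation, each vibrational mode with energy quantum hν contributes hν/2 + k_B T ln(1 − exp(−hν / k_B T)) to the free energy. The thermal term diverges as the frequency approaches zero, so soft intermolecular modes, which are typical of weakly bound hydrated clusters, can dominate the result.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
C_CM = 2.99792458e10    # speed of light, cm/s
KB = 1.380649e-23       # Boltzmann constant, J/K
NA = 6.02214076e23      # Avogadro constant, 1/mol

def harmonic_vib_free_energy(wavenumbers_cm, temperature=298.15):
    """Harmonic vibrational free energy in kJ/mol (zero-point plus thermal term).

    wavenumbers_cm: iterable of positive vibrational wavenumbers in cm^-1.
    """
    total = 0.0
    for nu in wavenumbers_cm:
        quantum = H * C_CM * nu                  # energy quantum h*nu in J
        x = quantum / (KB * temperature)
        # log1p keeps ln(1 - exp(-x)) accurate for stiff modes.
        total += 0.5 * quantum + KB * temperature * math.log1p(-math.exp(-x))
    return total * NA / 1000.0

# A stiff mode versus two soft modes (illustrative wavenumbers).
f_stiff = harmonic_vib_free_energy([1000.0])
f_soft_100 = harmonic_vib_free_energy([100.0])
f_soft_10 = harmonic_vib_free_energy([10.0])
```

Lowering a single mode from 100 to 10 cm^-1 changes its contribution by several kJ/mol in this model, which is why the treatment of low-frequency, anharmonic motions matters for computed binding free energies.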
- First-of-its-kind benchmark of hydration effects in acid–base clusters:
We performed systematic configurational sampling and quantum chemical benchmarking for a wide variety of hydrated atmospheric clusters. The resulting dataset is significantly more comprehensive than previous studies and provides a reliable reference for validating both theoretical models and experimental interpretations. This includes not only electronic energetics but also entropic contributions and structural dynamics.
- Publicly available tools and databases for the scientific community:
We released automated sampling scripts (the JK framework) and maintained the updated Atmospheric Cluster Database (ACDB 2.0), providing valuable infrastructure for the broader community. These resources enable researchers to build upon our findings and integrate hydration effects into diverse modeling frameworks, including those used by computational chemists, atmospheric modelers, climatologists, and environmental scientists.
- Machine learning tools tailored for molecular clusters:
We demonstrated that traditional similarity metrics (e.g. RMSD) often fail to capture the relevant structural complexity of hydrated clusters, while chemically informed kernels (such as FCHL) offer superior insight. Furthermore, we developed and trained ML models (e.g. PaiNN) on quantum chemical data, allowing for fast and accurate modeling of hydrated cluster dynamics. These models enabled extensive umbrella sampling molecular dynamics simulations, which would otherwise be computationally prohibitive. Through these simulations, we also identified systematic errors in traditional statistical thermodynamics applied to quantum chemistry, highlighting a significant methodological limitation. The ML-enhanced simulations form a bridge between molecular-scale dynamics and mesoscale aerosol models, supporting more realistic atmospheric modeling.
- Molecular-mechanistic insight into hydration-enhanced nucleation:
We found that many atmospheric acid–base molecular clusters are not significantly hydrated, contrary to some assumptions in the literature. However, for specific strongly bound systems (e.g. those involving sulfuric acid, methanesulfonic acid, and certain bases), humidity substantially affects nucleation pathways. While the overall enhancement factors align with previous studies, the mechanisms we identified differ markedly. Importantly, we also challenged the reliability of traditional statistical thermodynamics used in combination with quantum chemistry. Our umbrella sampling approach revealed significant discrepancies, suggesting that widely accepted methods may be flawed when applied to flexible, hydrated systems. This finding significantly advances the state of the art and warrants further investigation across the computational chemistry community.
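How the degree of hydration follows from binding free energies can be sketched with the law of mass action. In this standard picture, the equilibrium population of a cluster carrying n water molecules is proportional to (p_water / p_ref)^n exp(−ΔG_n / RT), where ΔG_n is the free energy of adding n waters to the dry cluster at the reference pressure p_ref. The function name, the free energy value, and the approximate saturation vapour pressure used below are illustrative assumptions, not results from this project.

```python
import math

R = 8.314462618e-3  # gas constant, kJ/(mol K)

def hydrate_distribution(dg_cumulative, rh, p_sat, temperature=298.15, p_ref=101325.0):
    """Equilibrium fractions of a cluster with 0, 1, 2, ... attached waters.

    dg_cumulative[n-1]: free energy (kJ/mol) of adding n waters to the dry
        cluster, referenced to pressure p_ref (Pa).
    rh: relative humidity as a fraction (0 to 1).
    p_sat: saturation vapour pressure of water (Pa) at the given temperature.
    """
    p_water = rh * p_sat
    weights = [1.0]  # the dry cluster is the reference state
    for n, dg in enumerate(dg_cumulative, start=1):
        weights.append((p_water / p_ref) ** n * math.exp(-dg / (R * temperature)))
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical weakly bound monohydrate (-5 kJ/mol); p_sat of roughly 3170 Pa near 298 K.
dist_rh50 = hydrate_distribution([-5.0], rh=0.5, p_sat=3170.0)
dist_rh100 = hydrate_distribution([-5.0], rh=1.0, p_sat=3170.0)
```

With these made-up numbers the dry cluster remains the dominant species even at saturation, which is the kind of behaviour meant by "not significantly hydrated". A more negative ΔG_n shifts the distribution towards the hydrated forms.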
Future Uptake and Needs
To ensure maximum impact and broader uptake of our results, we identify the following priorities:
- Further research: ML-enhanced umbrella sampling offers a powerful and more experimentally consistent method for assessing the thermodynamics of flexible systems. However, it is not yet widely adopted and should be used with care. Broader community testing and development are needed.
- Integration into climate models: We continue to collaborate with large-scale atmospheric modelers, providing new data to improve parameterizations used in climate and air quality simulations.
- Scientific networking and dissemination: Participation in interdisciplinary events (e.g. CECAM, EGU, ACTRIS) has helped establish a transnational network of researchers in particle formation. Continued engagement in such forums is critical for knowledge transfer and expanding impact.
- Publications and collaborations: At least two major publications are nearing submission, and several collaborative projects initiated during the fellowship are expected to lead to additional outputs.