Realistic and Informative Simulations with machine learnING

Periodic Reporting for period 2 - RISING (Realistic and Informative Simulations with machine learnING)

Reporting period: 2023-09-01 to 2024-08-31

When astrophysicists want to study the universe, they can't just observe everything directly. Many celestial events and processes take millions of years to unfold and happen on scales that are simply too vast. To tackle this, scientists create simulations – specifically, gravitational N-body simulations. These simulations allow them to model the interactions of large numbers of stars or galaxies under the influence of gravity.
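As a rough illustration of what a gravitational N-body simulation computes, here is a minimal sketch using direct summation of pairwise forces and a leapfrog (kick-drift-kick) integrator. It is not the code used in the project: the function names, the units (G = 1), and the softening length are assumptions chosen for the example, and production codes use far more efficient algorithms.

```python
import math

def accelerations(pos, mass, G=1.0, softening=1e-3):
    """Direct-summation gravitational accelerations with Plummer softening."""
    n = len(mass)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            inv_d3 = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2 + softening ** 2) ** -1.5
            for k in range(3):
                acc[i][k] += G * mass[j] * d[k] * inv_d3  # pull of j on i
                acc[j][k] -= G * mass[i] * d[k] * inv_d3  # equal and opposite
    return acc

def leapfrog(pos, vel, mass, dt, n_steps, G=1.0, softening=1e-3):
    """Advance the system by n_steps kick-drift-kick leapfrog steps."""
    pos = [p[:] for p in pos]  # work on copies, leave the inputs untouched
    vel = [v[:] for v in vel]
    acc = accelerations(pos, mass, G, softening)
    for _ in range(n_steps):
        for i in range(len(mass)):
            for k in range(3):
                vel[i][k] += 0.5 * dt * acc[i][k]  # half kick
                pos[i][k] += dt * vel[i][k]        # drift
        acc = accelerations(pos, mass, G, softening)
        for i in range(len(mass)):
            for k in range(3):
                vel[i][k] += 0.5 * dt * acc[i][k]  # second half kick
    return pos, vel

def total_energy(pos, vel, mass, G=1.0, softening=1e-3):
    """Kinetic plus (softened) potential energy, useful as a sanity check."""
    kinetic = sum(0.5 * m * sum(c * c for c in v) for m, v in zip(mass, vel))
    potential = 0.0
    for i in range(len(mass)):
        for j in range(i + 1, len(mass)):
            d2 = sum((pos[j][k] - pos[i][k]) ** 2 for k in range(3))
            potential -= G * mass[i] * mass[j] / math.sqrt(d2 + softening ** 2)
    return kinetic + potential

# Toy example: two equal masses on a circular orbit around their barycentre.
v_circ = math.sqrt(0.5)
mass = [1.0, 1.0]
pos0 = [[-0.5, 0.0, 0.0], [0.5, 0.0, 0.0]]
vel0 = [[0.0, -v_circ, 0.0], [0.0, v_circ, 0.0]]
pos1, vel1 = leapfrog(pos0, vel0, mass, dt=0.01, n_steps=1000)
```

Comparing `total_energy` before and after the integration is a simple way to check that the time step is small enough. The cost of the force loop grows with the square of the number of bodies, which is one reason large simulations are so expensive to run.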

Astrophysicists usually want simulations that are as close to reality as possible. However, assessing the realism of these simulations isn't straightforward. Much of the time, it boils down to subjective judgment – which is unreliable and prone to bias.

Moreover, when you start a simulation, you need to set up the initial conditions very precisely. It's like starting a massive, complex video game where you need to decide every detail of the world before you press play. Right now, there's no quick and easy way to get these initial conditions just right without resorting to extremely complex and resource-intensive methods.

Finally, deciding on which simulations to run is a bit like choosing what experiments to perform in a lab. You have a ton of options, but you need to pick the ones that will tell you the most about the system you're interested in. Currently, this decision-making is based on educated guesswork, which isn't always the most efficient.

These issues can lead to simulations that are not as effective or accurate as they could be. This inefficiency has real-world consequences. High-performance computers used to run these simulations consume vast amounts of electricity, so running unnecessary simulations is wasteful.

The RISING project is divided into three main parts, which address these issues one by one using suitable machine learning tools. At their core, these tools are the same generative AI technology that powers AI art and chatbots, applied to simulations instead.

1. RISING::Realism – Here, I am developing new, objective ways to measure how realistic our simulations are. This involves creating tools that can compare the simulated universe with what we observe through telescopes. To do this I leverage deep learning tools, in particular anomaly detection performed by a dedicated generative adversarial network.

2. RISING::Genesis – This is about devising new methods for setting initial conditions without directly relying on hydrodynamical simulations. The tools I am using come from machine learning: the goal is to learn the probability distribution of the positions, velocities, and masses of stars coming out of hydrodynamical simulations of star formation, so that new realisations can be obtained without the need to rerun the original simulations.

3. RISING::Active – In this part, I use active learning, which is an intelligent way to automate the process of choosing which simulations to run. Instead of relying on guesswork, I use algorithms that learn from previous simulations and systematically guide us towards the most informative and useful ones.
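To make the idea behind RISING::Active concrete, here is a minimal sketch of an active learning loop on a toy problem. Everything in it is an assumption made for illustration and not the project's method: `run_simulation` is a hypothetical stand-in for an expensive simulation with a binary outcome, the outcome is assumed to flip at a single hidden threshold in one parameter, and the next run is chosen where a committee of hypotheses consistent with the data disagrees most.

```python
TRUE_THRESHOLD = 0.62  # hypothetical value, hidden from the learner

def run_simulation(x):
    """Hypothetical stand-in for an expensive simulation with a binary outcome."""
    return int(x > TRUE_THRESHOLD)

def bracket(labelled):
    """Largest parameter with outcome 0 and smallest parameter with outcome 1."""
    lo = max(x for x, y in labelled if y == 0)
    hi = min(x for x, y in labelled if y == 1)
    return lo, hi

def most_informative(pool, labelled, committee_size=101):
    """Pick the unlabelled candidate on which the committee disagrees most."""
    lo, hi = bracket(labelled)
    candidates = [x for x in pool if lo < x < hi]
    if not candidates:
        return None  # the pool cannot resolve the boundary any further
    # Committee: threshold classifiers that agree with every label seen so far.
    step = (hi - lo) / committee_size
    committee = [lo + k * step for k in range(committee_size)]

    def disagreement(x):
        vote = sum(x > t for t in committee) / committee_size
        return abs(vote - 0.5)  # 0 means the committee is split evenly

    return min(candidates, key=disagreement)

def active_learning(pool, n_queries):
    """Run n_queries simulations, each chosen to be maximally informative.

    Assumes the two ends of the pool give different outcomes (0, then 1).
    """
    labelled = [(pool[0], run_simulation(pool[0])),
                (pool[-1], run_simulation(pool[-1]))]
    for _ in range(n_queries):
        x = most_informative(pool, labelled)
        if x is None:
            break
        labelled.append((x, run_simulation(x)))
    return labelled

pool = [i / 1000 for i in range(1001)]  # candidate simulation parameters
labelled = active_learning(pool, n_queries=8)
lo, hi = bracket(labelled)  # the boundary lies between lo and hi
```

In this toy setting each query roughly halves the interval that contains the boundary, whereas eight runs on an evenly spaced grid would only locate it to within about a ninth of the range. The sketch relies on the boundary being a single clean threshold, which is an assumption that need not hold for real simulations.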

Through the RISING project I am pushing the boundaries of what we know about the universe by making our simulations more precise, more efficient, smarter, and greener.

During my outgoing phase at Montreal University I interacted extensively with the group of Prof. Yashar Hezaveh and Prof. Laurence Perreault-Levasseur.
I became a member of CIELA, a new institute for computational astronomy founded in Montreal by Prof. Hezaveh, Prof. Perreault-Levasseur and others with the collaboration of Turing Award winner Prof. Yoshua Bengio.
I also joined Mila, formerly the Montreal Institute for Learning Algorithms, the main AI institute in Canada.

During this period I authored or co-authored 25 scientific publications, including four contributions to workshops at major international conferences (ICML and NeurIPS) and 11 articles in international refereed journals.
Regarding RISING::Realism, I collaborated with NYU student Zehao Jin and NYUAD Prof. Andrea Valerio Macciò to make use of state-of-the-art simulations and mock images from the NIHAO galaxy simulation suite.
With Zehao, I developed a method for comparing mock images from simulations to real SDSS images using an anomaly detection deep learning architecture based on a generative adversarial network.
I presented preliminary results as a poster at a conference held by the Center for Computational Astrophysics at the Flatiron Institute in New York. These results form the basis for a paper that is currently under review at The Astrophysical Journal.
A byproduct of this work, namely a CycleGAN model trained to translate mock images from simulations into photorealistic galaxy images and vice versa, formed the basis for an outreach project hosted at the dedicated website 10nebulae.art. Note that this predated the recent flood of generative AI art models by more than one year.
Regarding RISING::Genesis, I devised a method to generate new realisations of initial conditions for star cluster simulations using an approach based on hierarchical clustering. This is not a deep learning approach and is intended to serve as a baseline for later work. I have concurrently worked on applying diffusion models (the same technology underlying well-known AI image generators such as Stable Diffusion) to point clouds to the same end, and I will soon release the results of this work as an app on the RISING website and as part of a publication. Moreover, I co-supervised a master's student at Padua University, George Prodan, who graduated with a thesis on using generative models to create star cluster initial conditions.
Finally, in the context of RISING::Active, I have explored the application of active learning in a simplified setting, that of the three-body problem. This led to a contribution that was accepted by the NeurIPS workshop ML for physical sciences, to be held in New Orleans on Dec. 15, 2023, which includes the results of my supervision of a summer intern at Montreal University, Nicolas Payot.
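One simplified three-body configuration is the Sitnikov problem, the case named in this report's figure caption: two equal-mass primaries orbit each other while a massless third body moves along the axis through their barycentre, perpendicular to the orbital plane. The sketch below integrates that motion and labels a run as "escape" or "bound". It is an illustration under stated assumptions, not the code behind the workshop contribution: the units (G = 1, total primary mass 1, semi-major axis 1), the RK4 integrator, the escape criterion `z_escape`, and all function names are choices made for this example, and the Kepler solver is only intended for moderate eccentricities.

```python
import math

def primary_radius(t, e):
    """Distance of each primary from the barycentre (pericentre at t = 0)."""
    E = t  # first guess for the eccentric anomaly; the mean anomaly equals t
    for _ in range(10):  # Newton iterations on Kepler's equation E - e sin E = t
        E -= (E - e * math.sin(E) - t) / (1.0 - e * math.cos(E))
    return 0.5 * (1.0 - e * math.cos(E))

def accel(t, z, e):
    """Acceleration of the massless body along the z axis."""
    r = primary_radius(t, e)
    return -z / (r * r + z * z) ** 1.5

def integrate(v0, e=0.0, dt=0.01, t_max=50.0, z0=0.0):
    """Fixed-step RK4 integration; returns a list of (t, z, v) tuples."""
    t, z, v = 0.0, z0, v0
    traj = [(t, z, v)]
    for _ in range(int(round(t_max / dt))):
        k1z, k1v = v, accel(t, z, e)
        k2z, k2v = v + 0.5 * dt * k1v, accel(t + 0.5 * dt, z + 0.5 * dt * k1z, e)
        k3z, k3v = v + 0.5 * dt * k2v, accel(t + 0.5 * dt, z + 0.5 * dt * k2z, e)
        k4z, k4v = v + dt * k3v, accel(t + dt, z + dt * k3z, e)
        z += dt * (k1z + 2 * k2z + 2 * k3z + k4z) / 6.0
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        t += dt
        traj.append((t, z, v))
    return traj

def escapes(v0, e=0.0, z_escape=10.0, **kwargs):
    """Binary label: True if |z| exceeds z_escape within the integration time."""
    return any(abs(z) > z_escape for _, z, _ in integrate(v0, e, **kwargs))
```

For example, `escapes(2.5)` and `escapes(1.0)` label two launches from the barycentre in the circular case (`e = 0`), where energy is conserved and the outcome flips at a launch speed of 2 in these units. A classifier trained on such labels is the kind of object an active learning loop would try to refine. The sketch does not attempt to reproduce the fractal boundary reported in this document.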

As far as outreach is concerned, I engaged with the community through my website and personal blog, and via novel forms of online communication, such as participating in the prediction market website 'manifold.markets' with a focus on predicting the future development of AI. I also helped organize two hackathons that brought ~50 prospective interns to Montreal for a week to work on an intensive, hands-on astronomy and machine learning project.
RISING broke new ground in several respects. RISING::Active revealed a crucial limitation of active learning when dealing with classification on chaotic systems (such as most simulations in astronomy), which may present fractal decision boundaries. RISING::Realism introduced a new way to judge the realism of cosmological simulations without relying only on scaling laws. RISING::Genesis resulted in code that can produce realistic initial conditions for star clusters far beyond the current state of the art while relying on a limited number of hydrodynamical simulations of star formation.

In terms of career impact, I have been offered a permanent position at INAF (expected to start after the end of my incoming phase). This would have been unlikely had I not used my MSCA fellowship to gain experience in applied machine learning, since the application of machine learning techniques to astronomical data was a main selection criterion for the position.
Active learning being misled by the fractal decision boundary on the Sitnikov three-body problem