CORDIS - EU research results

Big Extragalactic Surveys Treated Through Innovative MEthods

Periodic Reporting for period 1 - BESTTIME (Big Extragalactic Surveys Treated Through Innovative MEthods)

Reporting period: 2020-06-26 to 2022-06-25

Astrophysics is at a tipping point where fundamentally new science can be enabled by applying machine learning techniques to large surveys with billions of galaxies and petabytes of data. This combination forms the cornerstone of the MSC project BESTTIME (Big Extragalactic Surveys Treated Through Innovative MEthods).
A clear example of such a “data mining” approach is the Sloan Digital Sky Survey (SDSS), which provided photometric and spectroscopic information for local galaxies over one third of the sky. Over the last two decades, SDSS has delivered precious evidence for understanding galaxy evolution: its wide area allows robust analyses that leverage large-number statistics. However, SDSS is limited to observing bright, nearby galaxies no further than a couple of billion light years from us. The exploration of the deeper cosmos (which also means looking further back in time) has been conducted mainly by the Hubble Space Telescope (HST) with “pencil beam” surveys that pierced an extremely small area of the sky (<1 square degree in total).
The overall objective of BESTTIME is to fill this gap by studying hundreds of thousands of galaxies (like SDSS) very far away (at distances reached by the HST surveys). More specifically, we focus on the most massive galaxies (5 to 100 times the mass of the Milky Way), which are very rare compared to other types of galaxies and therefore scarcely present in previous studies. Understanding the evolution of these galaxies is one of the most compelling issues in Astrophysics: how did they build up their mass so quickly? What prevented them from forming new stars after only a few Gyr?
By collecting unprecedented statistics for this kind of object, the project aims to shed light on both the internal and the “environmental” processes driving their characteristic growth, i.e. the formation of such an unusually large number of stars in a relatively short period of cosmic time. With respect to the environmental processes, the pivotal goal is to connect these galaxies to the population of dark matter haloes hosting them and, on larger scales, to the “cosmic web” of filaments that makes up the fabric of the universe.
The importance of this project for society goes beyond scientific progress in Astrophysics, as the machine learning methods developed within BESTTIME can be transferred to other domains such as Biomedical Sciences. For example, the software used here to analyse telescope images and broaden our knowledge of the universe’s history may also be applied to X-ray or MRI images. The statistical tools applied to our census of galaxies may turn out to be useful for the big databases in Public Health Sciences.
The project was divided into three work packages (WPs).

In WP1, my team and I devised an original machine learning code as a compelling alternative to the standard tools used to estimate galaxy physical properties such as redshift (z), stellar mass (M*) and star formation rate (SFR). I built the code starting from the self-organizing map (SOM) algorithm, using the latest observations in the COSMOS and SXDF fields as well as state-of-the-art simulations for calibration and validation. An image showing the layout of one of the two fields (COSMOS) compared to the footprint of HST surveys is attached to the present Summary. The SOM method is described and applied in two peer-reviewed publications, and has been presented in talks and seminars at different universities and research institutes. WP1 also included the creation of astronomical catalogs containing optical and infrared information for millions of galaxies. After being publicly released (with two related articles published in high-impact scientific journals), these catalogs have been downloaded by 100+ astronomers (and counting).
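To illustrate the general idea behind a SOM-based estimator, the following is a minimal sketch in Python with NumPy. It is not the BESTTIME code: the function names, the grid size, the learning-rate schedule and the use of a per-cell median are all assumptions made for illustration. The sketch trains a small rectangular map on feature vectors (for example, galaxy colours) and then assigns each cell the median of a known property (for example, redshift) taken from a calibration sample.

```python
import numpy as np


def best_matching_unit(weights, x):
    """Return the (row, col) of the cell whose weight vector is closest to x."""
    d2 = np.sum((weights - x) ** 2, axis=-1)
    return np.unravel_index(np.argmin(d2), d2.shape)


def train_som(data, grid=(8, 8), n_iter=2000, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small rectangular SOM (illustrative schedule, not tuned).

    data: array of shape (n_objects, n_features), assumed scaled to [0, 1].
    Returns weights of shape (rows, cols, n_features).
    """
    rng = np.random.default_rng(seed)
    rows, cols = grid
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every cell, used for the neighbourhood function.
    yy, xx = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)              # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5  # shrinking neighbourhood radius
        x = data[rng.integers(len(data))]    # one random training object
        bmu = best_matching_unit(weights, x)
        d2 = (yy - bmu[0]) ** 2 + (xx - bmu[1]) ** 2
        h = np.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
        # Pull every cell towards x, weighted by its distance from the BMU.
        weights += lr * h[..., None] * (x - weights)
    return weights


def label_cells(weights, data, labels):
    """Median calibration label per cell; NaN where no calibration object lands."""
    rows, cols, _ = weights.shape
    buckets = {}
    for x, y in zip(data, labels):
        buckets.setdefault(best_matching_unit(weights, x), []).append(y)
    out = np.full((rows, cols), np.nan)
    for cell, vals in buckets.items():
        out[cell] = np.median(vals)
    return out
```

Under these assumptions, a new galaxy would be assigned the label of its best-matching cell, i.e. `label_cells(...)[best_matching_unit(weights, x)]`. A real application would also need to handle photometric uncertainties and empty cells.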

In WP2, my collaborators and I constrained, for the first time, the stellar mass function (SMF) and the redshift-space 2-point correlation function (2PCF) of high-mass galaxies between z = 3 and 7, to characterize their growth and clustering in a regime that was poorly studied until now. The high-mass regime we probed allowed for a particularly valuable comparison with computer simulations, especially those reproducing galaxy evolution in big cosmological boxes, since they require as their observational counterpart a large galaxy catalog like the one built in WP1. Looking at the number density of massive systems up to high redshift, we found a remarkable degree of agreement between the latest theoretical predictions and our data, both showing an excess with respect to more established models of galaxy formation. Such an intriguing feature suggests that the latest physical recipes are correctly building up galaxies with up to 10^11 M⊙ less than 2 billion years after the Big Bang. The assembly efficiency is higher than expected in the classical “AGN feedback” scenario, and it is not the result of an ad-hoc calibration of the simulations’ parameters, since before this project there were no data available in this redshift and stellar mass regime for fine tuning. These results are presented in an article submitted for publication to a high-impact journal, and have already been presented at two international meetings.
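As a reminder of what the SMF measures, it is the number density of galaxies per unit comoving volume and per logarithmic interval of stellar mass. The following is a minimal sketch of a binned estimate, assuming a complete sample within a known comoving volume. The function name and arguments are hypothetical, and the sketch ignores the completeness corrections, mass uncertainties and cosmic variance that a real analysis would have to treat.

```python
import numpy as np


def stellar_mass_function(log_mass, volume_mpc3, bin_edges):
    """Binned SMF in units of Mpc^-3 dex^-1 (assumes a complete sample).

    log_mass: log10 of stellar masses (in solar masses) of the galaxies.
    volume_mpc3: comoving volume of the survey slice, supplied by the caller.
    bin_edges: edges of the log10(mass) bins.
    Returns bin centres, number density and its Poisson uncertainty.
    """
    bin_edges = np.asarray(bin_edges, dtype=float)
    counts, _ = np.histogram(log_mass, bins=bin_edges)
    dlogm = np.diff(bin_edges)                      # bin widths in dex
    phi = counts / (volume_mpc3 * dlogm)            # number density per dex
    phi_err = np.sqrt(counts) / (volume_mpc3 * dlogm)  # Poisson errors only
    centres = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    return centres, phi, phi_err
```

The comoving volume is left as an input here because computing it requires a choice of cosmology and survey geometry that the summary does not specify.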

The achievement of WP3 was the implementation of a “halo model” relating statistical properties of dark matter (in particular the halo mass function) to baryonic properties (especially M*). The model also allows central and satellite haloes to be distinguished statistically, and therefore provides a more refined description of the universe than the SMF study. The results (presented in a peer-reviewed paper) support the scenario in which the peak of star-formation efficiency moves towards more massive haloes at higher redshifts. The comparison with simulations showed that even when theoretical models correctly reproduce the 1-point statistics (see WP2), they still struggle to incorporate environmental mechanisms such as galaxy-galaxy interactions and the gravitational effects of dark matter in the massive regime. In fact, the WP3 results revealed a significant discrepancy: simulations generally have a higher contribution from satellites to the total stellar mass budget than our observations. This means that the feedback mechanisms acting in group- and cluster-scale haloes, as implemented in the simulations, are likely to be less efficient in quenching the mass assembly of satellites. This part of the project was completed only recently and has yet to display its full impact. Nonetheless, I have already been contacted by a few research teams interested in comparing their independent analyses with our findings.
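The notion of a “peak of star-formation efficiency” can be made concrete with a toy stellar-to-halo mass relation. The sketch below uses a double power-law form that is common in the literature. It is an assumption chosen for illustration, not necessarily the parametrisation adopted in WP3, and the parameter names (`norm`, `m1`, `beta`, `gamma`) are hypothetical. The ratio M*/Mh rises at low halo mass, declines at high halo mass, and peaks at a halo mass that follows analytically from the two slopes.

```python
import numpy as np


def stellar_to_halo_ratio(mh, norm, m1, beta, gamma):
    """Toy double power law: M*/Mh = 2 * norm / [(Mh/m1)^-beta + (Mh/m1)^gamma].

    mh: halo mass (same units as m1); norm: ratio at mh == m1;
    beta, gamma: positive low-mass and high-mass slopes.
    """
    x = np.asarray(mh, dtype=float) / m1
    return 2.0 * norm / (x ** (-beta) + x ** gamma)


def peak_halo_mass(m1, beta, gamma):
    """Halo mass at which the ratio above is maximal.

    Setting the derivative of x^-beta + x^gamma to zero gives
    x^(beta + gamma) = beta / gamma.
    """
    return m1 * (beta / gamma) ** (1.0 / (beta + gamma))
```

In a model of this kind, a peak that shifts towards more massive haloes at higher redshift corresponds to a characteristic mass, or a combination of slopes, that evolves with redshift. The actual evolution is a result of the fit to data and is not encoded in this sketch.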
BESTTIME also represented the starting point for further explorations with next-generation facilities: the work done so far contributed to the identification of interesting sources that will be observed with the James Webb Space Telescope (a proposal has been scheduled in the Cycle 1 General Observers programme). The novel SOM methods and the datasets exploited are also aligned with the future perspective of Europe: indeed, the Euclid Consortium expressed interest in building on our deliverables. The interdisciplinary potential of this MSC project, also mentioned above, elicited spin-off collaborations with computer scientists and researchers in Public Health, which are presently ongoing, with the goal of obtaining funds for the joint supervision of MSc and PhD students.
The Cosmic Evolution Survey (COSMOS) compared to the Moon and Hubble deep fields.