Novel Evolutionary Model for the Early stages of Stars with Intelligent Systems

Periodic Reporting for period 1 - NEMESIS (Novel Evolutionary Model for the Early stages of Stars with Intelligent Systems)

Reporting period: 2021-03-01 to 2022-02-28

NEMESIS has the ambition to reshape our understanding of the formation of stars by employing artificial intelligence methods to interpret the largest panchromatic data collection of young stellar objects. Recent evidence suggests that planets form synchronously with, rather than sequentially to, their host stars, indicating a rapid early evolution of star-planet systems. To ascertain these timescales, it is necessary to first determine the characteristic transitions that describe each phase of star formation. The definition of classes for young stellar objects was made possible more than 30 years ago by the first space-based infrared sky surveys. Whilst successful in determining global properties, the current classification is prone to large uncertainties; the timescales, which are derived from population statistics among the different classes under the assumption of steady-state evolution, therefore remain dubious.
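As an illustration of how class populations translate into timescales under the steady-state assumption, the sketch below classifies sources by their infrared spectral index and scales class lifetimes by relative counts. The report does not state which scheme or reference lifetime is used; the class boundaries here follow the commonly used Greene et al. (1994) convention, and the 2 Myr reference lifetime for Class II is an assumption, as are all function names.

```python
import math


def spectral_index(wavelengths_um, flux_densities_jy):
    """Least-squares slope of log10(lambda * F_lambda) against log10(lambda).

    For flux densities F_nu, lambda * F_lambda is proportional to F_nu / lambda;
    the constant of proportionality does not affect the slope.
    """
    xs = [math.log10(w) for w in wavelengths_um]
    ys = [math.log10(f / w) for w, f in zip(wavelengths_um, flux_densities_jy)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def classify(alpha):
    """Map a spectral index to a class (assumed boundaries: 0.3, -0.3, -1.6)."""
    if alpha >= 0.3:
        return "I"
    if alpha >= -0.3:
        return "Flat"
    if alpha >= -1.6:
        return "II"
    return "III"


def steady_state_lifetimes(class_counts, ref_class="II", ref_lifetime_myr=2.0):
    """Scale lifetimes by relative counts: t_class = t_ref * N_class / N_ref.

    ref_lifetime_myr is an assumed value, not one taken from the report.
    """
    n_ref = class_counts[ref_class]
    return {
        name: ref_lifetime_myr * count / n_ref
        for name, count in class_counts.items()
    }
```

Because every lifetime is tied to the counts in each class, a source moved across a class boundary by photometric uncertainty shifts the inferred timescales directly, which is why an uncertain classification makes the timescales uncertain as well.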

NEMESIS aims to readjust the current classification scheme and its characteristic timescales so that they are consistent with the most recent observational and theoretical constraints. To meet these goals, NEMESIS will compile the largest panchromatic dataset comprising all young stellar objects in nearby star-forming regions, harnessing critical information that resides in data from space missions. It will reprocess and analyze this unique dataset with supervised and unsupervised machine learning algorithms, deep learning neural networks for object detection, clustering and regression analysis of images, in order to advance the analysis and interpretation beyond the current state of the art. Ultimately, NEMESIS brings big data techniques and hybrid machine learning methods to systematically analyze and interpret large data volumes in order to answer some of the most persistent questions, paving the way toward data-intensive science applications in modern astrophysics.

NEMESIS is designed to evolve in three main phases; this report covers Phase I (kick-off), which took place during the first year of the project. Beyond science, Phase I included setting up the team and all related infrastructure and channels that enable close collaboration among the team members. Moreover, it involved establishing the major channels to communicate the project activities and disseminate its results, and to reach a wider network of collaborators who would help advance the NEMESIS objectives and extend their reach to the wider community. To this end, Audard & Dionatos had a proposal approved to form an ISSI team, bringing together experts from diverse disciplines in astrophysics, astro-informatics and machine learning. The major objectives of the team are to assist in defining (i) the best machine learning methods and (ii) the most descriptive datasets for a new young stellar object (YSO) classification.

As an important pillar of the project, data compilation was initiated immediately. In a first stage, catalogued data for nearby star-forming regions were retrieved. For young stellar objects, infrared wavelengths are particularly important, so data from space-borne infrared facilities (e.g. Herschel, Spitzer, AKARI, WISE) were given priority. Nonetheless, data spanning the entire electromagnetic spectrum, either from space (e.g. Hubble, XMM-Newton, Chandra) or from ground-based facilities (e.g. ALMA, APEX, JCMT, 2MASS), can provide important information on the evolution of YSOs and were therefore also retrieved. Part of the data compilation is based on reduced, published data retrieved from the literature and/or databases, while data of specific interest are being freshly reduced by the NEMESIS team.

Aiming to accelerate the production of early results, we prioritised the data compilation for a single star-forming region: Orion. The selection was based both on the number of young stellar sources, Orion being the largest nearby star-forming region, and on the amount of available data, since Orion is one of the best-studied star-forming regions. The Orion data compilation allowed us to test a number of different machine learning methods on the actual data and evaluate their performance.

The power of machine learning techniques is not limited to the analysis of the data compilation; the techniques are also applied in many other components of the project, including the reduction of specific data. The reduction of Herschel data, and in particular the compilation of high-quality photometric point source catalogues, is a key component of NEMESIS and also an important legacy result of the project. Herschel has been the largest and most advanced far-infrared facility to date, a status it is expected to retain until at least the mid-2030s. Herschel data will therefore remain the best far-IR data available to science for the foreseeable future, and high-quality end products will represent an important service to the community.

With NEMESIS we introduce big data and machine learning techniques to the field of star formation to an extent never attempted before. Aiming to remain at the center of attention in a swiftly advancing field, we are organizing meetings in the context of the largest conferences taking place this year in Europe. These include the annual meeting of the European Astronomical Society (EAS) and the scientific assembly of the Committee on Space Research (COSPAR), with some of the leading figures in the field as invited speakers.