CORDIS - EU research results

Herschel Extragalactic Legacy Project

Final Report Summary - HELP (Herschel Extragalactic Legacy Project)

Executive Summary:
The key achievements of HELP have been to (a) assemble a vast array of high-quality astronomical data from a wide range of telescopes and observatories over the same 1,300 deg² area of the sky; (b) merge these to produce a master list of sources in a common format; (c) calculate new photometric redshifts with a hierarchical Bayesian methodology that incorporates the best from many methods; (d) develop a new fully Bayesian methodology, XID+, for the photometry of low-resolution (far-infrared and sub-mm) data that is extendable and able to exploit a wide range of a priori information; (e) provide new estimates of key physical parameters including star formation rates and stellar masses; and (f) do all of the above using a pioneering Open Science approach with fully public methods including data, software and full workflows.
Project Context and Objectives:
High level overview

The Herschel Extragalactic Legacy Project (HELP)’s main objective was to provide a rich new data set characterising the physical properties of hundreds of thousands of distant galaxies. This brought together a vast range of data from many different astronomical observatories. The focus was on the images produced by ESA’s Herschel mission. These images chart the star formation enshrouded in dust and our work will allow the entire astronomical community to unlock the full potential of those images.

Concept & Objectives

How did galaxies form and evolve? This is one of the biggest and most challenging questions in astronomy. Although astronomers now have a good understanding of the background cosmology and of the formation of the large-scale structure of the dark matter, the complex astrophysics that leads to the variety and numbers of galaxies observed within the dark-matter halos is still very poorly understood.
Limited clues to these questions can be found through focussed studies of individual galaxies. However, the fundamental requirement for rigorous testing of any theories of galaxy formation and evolution is a complete statistical audit or census of the stellar content and star formation rates of galaxies in the Universe at different times and as a function of the mass of the dark matter halos that host them.
This audit requires many elements. We need unbiased maps of large volumes of the Universe made with telescopes that probe the different wavelengths at which the different physical processes of interest manifest themselves. We need catalogues of the galaxies contained within these maps with photometry estimated systematically and consistently from field-to-field, from telescope-to-telescope and from wavelength-to-wavelength. We need to understand the probability that a galaxy of given properties can appear in our data sets. We need the machinery to bring together these various data sets and calculate the “value-added” physical data of primary interest, e.g. the distances (or redshifts), stellar masses, star-formation rates and the actual number densities of the different galaxy populations.

Our project brought together teams that have been undertaking ambitious coordinated multi-wavelength programmes to study large volumes of the distant Universe to achieve this. During the period of this grant these surveys have become substantially complete and we have been able to undertake the necessary homogenisation and add value, providing a comprehensive census of the galaxy populations in the distant Universe.

Project Results:
Value added to Herschel Mission and coordination with space and ground-based surveys

Our goal was to provide the community with a collection of data, value-added products, and tools that will become the reference for astronomers studying large samples of galaxies over representative volumes of the high-redshift Universe. Comparable projects are SDSS, COSMOS and GAMA. Those projects are exemplary though more limited in terms of depth or sky area. We believe that the quality of the products that we have delivered to the community matches or exceeds theirs. A driving philosophy for HELP has been a principle of Open Science, i.e. we have provided the data, software and workflows freely and transparently so that the community can reproduce and extend any of our analyses. Specifically, we have released tools that implement all the methods we have used to derive our products. This means, for instance, that members of the community are able to take their own data and obtain not only an estimate of the Herschel fluxes for their sources, but also the probability distribution of those fluxes.

Such scientific products have a life expectancy going well beyond their development. The products listed hereafter will be a valuable and unmatched resource for astronomers for the foreseeable future, as no other planned instrument will be able to replicate Herschel’s wavelength coverage. The facilities involved in this project span wavelengths from the UV to the radio, and fields from small and deep to wide and shallow. Through this wide scope, the project is able to reach a large fraction of the community interested in extragalactic astronomy. All the products (catalogues, templates) and tools developed by the team have been distributed to the community, providing a rich and lasting legacy.

An optical and near-infrared “master list” of unprecedented complexity and size

The HELP area was defined to be those regions (1,270 deg²) of the celestial sky that had been observed by Herschel in surveys designed to understand extragalactic populations. By design those surveys had been undertaken in fields that maximised the available data from other telescopes. The HELP fields thus contain the best and richest astronomical data sets for extragalactic science. A key achievement of HELP has been to assemble, collate, and homogenise the catalogue data from the many optical and near-infrared surveys that have been undertaken into one single “master list”. This work brought together 60 separate surveys and 500 GB of catalogue data and produced a master list of 171.6M galaxies (an average surface density of 135k galaxies deg⁻²). The workflows that were used to merge the catalogues are all public, so that the outcome is reproducible and the provenance of any data is clear. Our scientific validation of these products, including diagnostic plots, is also public (integrated with the software used to produce it).
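
To illustrate what merging catalogues into a master list involves, here is a minimal positional cross-match sketch. It is not the HELP workflow: the function names, the dictionary keys (`ra`, `dec`, `f_r`, `f_K`) and the 0.8 arcsec match radius are assumptions chosen for illustration, and the brute-force loop would be far too slow for catalogues of the size quoted above.

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds (haversine); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(math.sqrt(h))) * 3600.0


def merge_catalogues(master, new, radius_arcsec=0.8):
    """Merge `new` into `master` by nearest-neighbour position matching.

    Each catalogue is a list of dicts with 'ra' and 'dec' in degrees plus
    any photometry columns. A new source within `radius_arcsec` (an
    illustrative value) of a master source contributes its extra columns
    to that row; unmatched new sources are appended as new rows.
    """
    merged = [dict(row) for row in master]
    n_master = len(merged)  # only match against the original master rows
    for src in new:
        best_idx, best_sep = None, radius_arcsec
        for i in range(n_master):
            sep = angular_sep_arcsec(merged[i]["ra"], merged[i]["dec"],
                                     src["ra"], src["dec"])
            if sep <= best_sep:
                best_idx, best_sep = i, sep
        if best_idx is None:
            merged.append(dict(src))
        else:
            extra = {k: v for k, v in src.items() if k not in ("ra", "dec")}
            merged[best_idx].update(extra)
    return merged
```

A production workflow would normally use a spatial index rather than comparing every pair of sources, and would record which survey each column came from so that provenance stays clear.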

Ultimate exploitation of Herschel-SPIRE data and other far infrared data

The Herschel-SPIRE instrument has performed exceptionally and high-fidelity maps can be produced with relative ease. Through HELP we have made some of this software publicly available. The fast mapping speed of SPIRE has enabled surveys covering around 1,300 deg² with sensitivities close to the confusion limit. These surveys have produced catalogues of hundreds of thousands of sources. We have assembled the catalogues produced by the legacy survey teams. We have also extracted new catalogues of galaxies homogeneously over ~1,000 deg².

These catalogued sources account for 15 per cent of the cosmic infrared background. However, much more can be gained from the maps directly. Fluctuations in the deepest images account for ~60 per cent of the far-infrared background at SPIRE wavelengths.

The full interpretation of any Herschel SPIRE data ultimately requires association with ancillary data at other wavelengths in order to understand other fundamental properties such as distance, stellar content or AGN content.

HELP has thus developed a new, extendable tool, XID+, to exploit the “confused” map data and extract the maximum amount of useful information by modelling the far-infrared fluxes of galaxies with some properties known a priori from the ancillary data. This tool enables new science and the exploitation of the Herschel observing time to its maximum. This tool can be run by astronomers and allows them the freedom to define the important “prior” constraints on the modelling appropriate for their science goals.

The XID+ tool is computationally expensive to run, so we have applied it with a generic prior to provide fluxes from Spitzer MIPS 24 µm, Herschel PACS and Herschel-SPIRE for a subset of objects within our master list. This currently provides fluxes for 807k galaxies over 544 deg², and work continues to increase this sample.
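
The idea of prior-based photometry in confused maps can be shown with a toy model. The sketch below is not XID+ and does not use its interface. It assumes a one-dimensional map, a Gaussian beam, two sources whose positions are known from ancillary data, Gaussian pixel noise, and a flat non-negative flux prior evaluated on a brute-force grid. All function and parameter names are hypothetical.

```python
import math


def gaussian_psf(x, centre, fwhm):
    """Unit-peak Gaussian beam evaluated at pixel position x."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2)


def flux_posterior(data, pixels, positions, fwhm, noise, grid):
    """Joint posterior for the fluxes of two blended sources.

    The map is modelled as f1 * PSF(x - x1) + f2 * PSF(x - x2) with the
    positions fixed by the prior catalogue. The flux prior is flat over
    the values in `grid` (toy assumption). Returns {(f1, f2): probability}.
    """
    p1 = [gaussian_psf(x, positions[0], fwhm) for x in pixels]
    p2 = [gaussian_psf(x, positions[1], fwhm) for x in pixels]
    log_post = {}
    for f1 in grid:
        for f2 in grid:
            chi2 = sum(((d - f1 * a - f2 * b) / noise) ** 2
                       for d, a, b in zip(data, p1, p2))
            log_post[(f1, f2)] = -0.5 * chi2
    peak = max(log_post.values())  # subtract the peak to avoid underflow
    weights = {k: math.exp(v - peak) for k, v in log_post.items()}
    norm = sum(weights.values())
    return {k: w / norm for k, w in weights.items()}


def summarise(posterior, index):
    """Marginal posterior mean and standard deviation of one source's flux."""
    mean = sum(k[index] * w for k, w in posterior.items())
    var = sum((k[index] - mean) ** 2 * w for k, w in posterior.items())
    return mean, math.sqrt(var)
```

Because the two sources sit closer together than the beam width, their flux estimates are anti-correlated. The full posterior captures this, which is why a probability distribution is more informative than a single flux per source.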

Lowering the barriers to statistical survey science

Much of the science from extragalactic surveys comes from statistical studies of the populations. Maximising the statistical sample size requires joining together data from different sources. HELP have homogenised the data from different surveys to make the differences between the surveys transparent to the user, and provided the software so this can be repeated for new data. Scientific studies also require good knowledge of the selection effects that have gone into the making of those products. Often an understanding of those selection effects is harder to obtain than the basic data, meaning that this science can only be achieved by the original survey teams.

We have developed methods to define the selection functions from public data and provide these consistently over many different data sets enabling more science to a wider community.

Value-added survey data products and tools

The basic data that come from telescopes include the fluxes and positions of the galaxies. HELP provides a wide range of additional information to expand and enhance the basic data, including photometric redshifts based on a wide variety of templates and physical parameters for each galaxy (star-formation rates, stellar masses, dust masses etc.).
These value-added products will make it much quicker and easier to exploit the many data sets.

Redshifts

Spectroscopic redshifts have been gathered from many surveys over all 1,270 deg². These have been collated, with duplicates removed, into a simple consistent format with homogenised quality information. In total 849,863 redshifts were collected, resulting in 755,796 unique redshifts, of which 607,119 are deemed to be reliable.
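
A minimal sketch of the collation step follows. It assumes each record already carries a common object identifier and a homogenised integer quality flag in which higher means more reliable. The key names and the reliability threshold are hypothetical, and in practice duplicates would be identified by position rather than by a shared identifier.

```python
def collate_redshifts(records):
    """Keep one spectroscopic redshift per object: the highest-quality entry.

    `records` is an iterable of dicts with the hypothetical keys
    'obj_id', 'z', 'quality' and 'survey'.
    """
    best = {}
    for rec in records:
        current = best.get(rec["obj_id"])
        if current is None or rec["quality"] > current["quality"]:
            best[rec["obj_id"]] = rec
    return list(best.values())


def reliable(records, min_quality=3):
    """Select entries at or above an (illustrative) quality threshold."""
    return [rec for rec in records if rec["quality"] >= min_quality]


# Fractions implied by the counts quoted in the text.
collected, unique, deemed_reliable = 849_863, 755_796, 607_119
unique_fraction = unique / collected
reliable_fraction = deemed_reliable / unique
```

The counts quoted above imply that roughly 89 per cent of the collected redshifts are unique, and that roughly 80 per cent of the unique redshifts are deemed reliable.
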
Photometric redshifts have been computed using a new hierarchical Bayesian approach which combines the redshift estimates arising from different template sets in a justifiable probabilistic manner. The results, and the characterisation of the errors in PDF form, have been calibrated using our spectroscopic compilation. This has been carried out for 14 HELP fields totalling 1,221 deg² (96% of the HELP area), yielding photometric redshifts for a total of 92,190,306 objects, with an average success rate of 67% for master list sources in a given field and 54% of the total number of optical master list sources in the HELP database.
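
The report does not give the formula used, so the following sketch shows only one common form of hierarchical Bayesian combination, stated here as an assumption. Each template set's redshift PDF is taken to be wrong with probability `f_bad`, in which case it is replaced by a uniform distribution. The mixtures are multiplied together, with an exponent `1/beta` to allow for correlated estimates, and `f_bad` is marginalised over. The function names, the `f_bad` values and `beta` are illustrative choices, not HELP settings.

```python
import math


def gaussian_pdf(z, mean, sigma):
    """Normal density, used here only to build toy input PDFs."""
    return (math.exp(-0.5 * ((z - mean) / sigma) ** 2)
            / (sigma * math.sqrt(2.0 * math.pi)))


def combine_pdfs(z_grid, pdfs, f_bad_values=(0.05, 0.1, 0.2, 0.3), beta=1.0):
    """Combine redshift PDFs sampled on a common, evenly spaced grid.

    For each f_bad (which must be > 0) every input P_i(z) is replaced by
    the mixture f_bad * U(z) + (1 - f_bad) * P_i(z). The mixtures are
    multiplied, raised to 1/beta, summed over the f_bad values with equal
    weight, and the result is normalised to unit integral.
    """
    dz = z_grid[1] - z_grid[0]
    uniform = 1.0 / (z_grid[-1] - z_grid[0])
    combined = [0.0] * len(z_grid)
    for f_bad in f_bad_values:
        for j in range(len(z_grid)):
            log_p = 0.0
            for pdf in pdfs:
                log_p += math.log(f_bad * uniform + (1.0 - f_bad) * pdf[j]) / beta
            combined[j] += math.exp(log_p)
    norm = sum(combined) * dz
    return [p / norm for p in combined]
```

With two estimates that roughly agree and one discrepant estimate, the combined PDF peaks between the agreeing pair while keeping a small amount of probability near the outlier, rather than being driven to zero by a plain product.
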

CIGALE modelling

Physical properties of the galaxies detected by Herschel have been estimated with the Code Investigating GALaxy Emission (CIGALE) developed in the Laboratoire d’Astrophysique de Marseille. CIGALE is designed to estimate the physical parameters by comparing modelled galaxy spectral energy distributions to observed ones. CIGALE conserves the energy balance between the dust-absorbed stellar emission and its re-emission in the IR.
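
The energy-balance principle can be written down in a few lines. This is a sketch of the principle only, not CIGALE code: the function names are hypothetical, the attenuation curve is supplied by the caller, and the integration is a simple trapezoid rule.

```python
def trapezoid(x, y):
    """Trapezoidal integral of y over x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def absorbed_luminosity(wavelengths, intrinsic, attenuation_mag):
    """Stellar luminosity removed by dust.

    `intrinsic` is the unattenuated L_lambda and `attenuation_mag` is
    A_lambda in magnitudes on the same wavelength grid. The attenuated
    spectrum is L_lambda * 10**(-0.4 * A_lambda), so the absorbed part
    is L_lambda * (1 - 10**(-0.4 * A_lambda)).
    """
    absorbed = [l * (1.0 - 10.0 ** (-0.4 * a))
                for l, a in zip(intrinsic, attenuation_mag)]
    return trapezoid(wavelengths, absorbed)


def scale_dust_template(wavelengths, template, l_absorbed):
    """Rescale an IR dust template so it re-emits exactly `l_absorbed`."""
    factor = l_absorbed / trapezoid(wavelengths, template)
    return [factor * t for t in template]
```

Tying the infrared normalisation to the absorbed UV-to-NIR light means that the far-infrared photometry constrains the attenuation, and the attenuation constrains the far-infrared emission, within a single fit.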

The SED fitting process was optimised in two steps. Prior to the availability of HELP data, the method was first developed and optimised for IR-bright galaxies. Nine HELP fields were then studied, and 855,651 spectral energy distributions with ~20 photometric bands each have been analysed. We used the same strategy for all of them to obtain a homogeneous output of estimated physical parameters.
We delivered four main physical properties of galaxies: stellar mass, star formation rate, and two dust luminosities, one based on SED fitting of the full spectrum and the other estimated from the IR measurements only. We also delivered several quantities which describe the quality of the fits. All the physical parameters were obtained from the probability distribution function (PDF) of each parameter given by CIGALE.
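
As an illustration of how a parameter estimate can be read off a PDF built from a grid of models, the sketch below weights each model by its likelihood, exp(-χ²/2), and returns the weighted mean and standard deviation. It is a generic sketch under that assumption, with hypothetical function and argument names, and is not taken from the CIGALE source.

```python
import math


def bayesian_estimate(models, observed, errors):
    """Likelihood-weighted mean and standard deviation of one parameter.

    `models` is a list of (parameter_value, model_fluxes) pairs, and
    `observed` and `errors` are the measured fluxes and their 1-sigma
    uncertainties in the same bands as `model_fluxes`.
    """
    chi2 = [sum(((o - m) / e) ** 2 for o, m, e in zip(observed, fluxes, errors))
            for _, fluxes in models]
    best = min(chi2)  # subtract the minimum so the weights do not underflow
    weights = [math.exp(-0.5 * (c - best)) for c in chi2]
    total = sum(weights)
    values = [value for value, _ in models]
    mean = sum(w * v for w, v in zip(weights, values)) / total
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    return mean, math.sqrt(var)
```

Unlike the single best-fitting model, the weighted mean and its spread reflect every model that is compatible with the data, so a poorly constrained parameter shows up as a wide PDF.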

The analysis of dusty galaxies has led us to revisit the process of dust attenuation in galaxies and its effect on their UV to NIR spectral energy distributions. New flexible recipes have been proposed to represent the variety of physical conditions at work in galaxies.

CYGNUS Radiative Transfer modelling

We have developed large libraries of radiative transfer models for the AGN torus (assuming a tapered disc geometry), for starbursts and for spheroidal galaxies. The models for spheroidal galaxies assume that stars, dust and molecular clouds are mixed in a distribution that follows a Sérsic profile. Both the starburst and the spheroidal galaxy models incorporate the stellar population synthesis model of Bruzual & Charlot. In the case of the spheroidal galaxies, a different library is computed at a number of representative redshifts in the range 1-5 to take into account the age of the galaxy.
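
For reference, the standard Sérsic law is I(r) = I_e exp(-b_n [(r/r_e)^(1/n) - 1]), where r_e is the half-light radius and n is the Sérsic index. The sketch below evaluates it using a widely used series approximation for b_n. How the radiative transfer models apply the profile to the stars and dust is not described in the text, and the function names here are hypothetical.

```python
import math


def sersic_b(n):
    """Approximate b_n (Ciotti & Bertin 1999 series, leading terms)."""
    return 2.0 * n - 1.0 / 3.0 + 4.0 / (405.0 * n)


def sersic_profile(r, r_e, n, i_e=1.0):
    """Sersic intensity at radius r, normalised so that I(r_e) = i_e."""
    return i_e * math.exp(-sersic_b(n) * ((r / r_e) ** (1.0 / n) - 1.0))
```

An index of n = 4 gives the de Vaucouleurs profile often used for spheroids, while n = 1 gives an exponential profile.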

We have also demonstrated that the libraries can be efficiently fed into MCMC SED fitting routines to fit galaxies with a range of types and extract physical quantities of interest such as star formation rate, stellar mass, AGN fraction etc. We therefore anticipate that these models will be especially useful for analysing the HELP data and other data assembled with future missions.

Expanded use and relations with established international space powers

We add value to past NASA missions, primarily Spitzer, but also GALEX and WISE. Our supporters include the PI of SWIRE and the deputy PI of SpUDS. HELP have a close relationship with the Spitzer SERVS/DeepDRILL team (the PI is on the HELP Science Advisory Board).

We have a partnership with ISAS/JAXA that provides direct involvement in Herschel observations of AKARI deep fields (South and North Ecliptic Poles, SEP and NEP). The AKARI SEP field, the deepest field observed with the Far Infrared Surveyor instrument, is one of the HELP fields.

Fields at high ecliptic latitude are becoming key extragalactic and cosmological fields on which the future Euclid mission is expected to concentrate its deep investigations. The Euclid images will also have to be coupled to ancillary datasets of different resolution and the sophisticated methods developed for HELP are expected to be very useful in performing the best associations.

Looking to the future, the JAXA/ESA mission SPICA is expected to provide the community with a huge advance in far-IR spectroscopic sensitivity. Deep and wide far-IR observations are clearly useful to provide interesting targets for SPICA/SAFARI to follow up, so HELP is very much a mandatory target finder for SPICA. Additionally, since SPICA/SAFARI will also deeply scrutinise the sky in the 35-210 µm wavelength range, the techniques developed here for source photometry in confused data and for association with other surveys will have immediate parallel applications in the SPICA mission.

XID+ analysis of synthetic data is already being used to help test and improve the science case, and thus the design requirements, of the Galaxy Evolution Probe (a concept for a far-infrared survey observatory).

The Atacama Large Millimetre/sub-millimetre Array (ALMA) and the James Webb Space Telescope (JWST) both have small instantaneous fields of view. Most programmes on these facilities will be a follow-up of pre-identified interesting targets (rather than objects discovered by those facilities themselves). The same is true for the ESO Extremely Large Telescope project, ELT, and similar facilities planned around the world. The HELP catalogues provide a high-quality atlas of the distant Universe to provide targets for these facilities. A particular example is a new catalogue of galaxies selected to be “red” in their far-infrared colours; a high fraction of these will be at high redshifts, z > 4.
Potential Impact:
The potential impact of the HELP project falls under three main categories: Academic Impact, Public Engagement and Other Socio-economic Impact. We briefly discuss each below.

Academic Impact

The primary impact of HELP will be on the academic research community. We have adopted a traditional approach to engaging our academic audiences through publication of research papers, attending and speaking at conferences and workshops, and giving talks on HELP. The Open Science nature of the project is specifically designed to enable a long-lasting legacy in which the project is developed by the community.
Towards the end of the project, when the scale and nature of the data deliverables became clear, we defined a communication plan. This identifies the key messages about the benefits of HELP that we want to communicate to our academic research audience. We then defined how we can get those messages across through web pages, key HELP presentation slides, engagement with specific science project survey teams (including through our contacts on the Scientific Advisory Board), a key survey paper to be submitted to Monthly Notices of the Royal Astronomical Society and a companion article in the Astronomy & Geophysics newsletter. We have already seen an uptake of the XID+ code by the team that are developing a NASA mission concept, the Galaxy Evolution Probe http://adsabs.harvard.edu/abs/2018AAS...23112102G. The UWC partners of HELP have also established a spin-out project, the HELP-IDIA Panchromatic PrOject (HIPPO), to continue and expand the work of HELP, specifically for the next generation of radio surveys http://mattiavaccari.net/hippo/.

Public Engagement

Considerable effort on public engagement was undertaken by Jillian Scudder, who worked on HELP but was funded by STFC. Jillian has a wide blog and Twitter following, specifically through her https://astroquizzical.com/ blog; some of the public events she did are listed at https://www.jillianscudder.com/outreach/. Her HELP paper “The multiplicity of 250-μm Herschel sources in the COSMOS field” was accompanied by a press release and was picked up by 21 news outlets (see https://www.altmetric.com/details/7118856). We began developing a project to produce a Herschel planetarium show with a company, TeqQ4, though this is currently on hold. A press release also accompanied the HATLAS data release papers.

Other Socio-economic Impact

We realised that much of the potential benefit of the HELP project came through the data analysis skills and techniques that the researchers had and developed during the project. This was particularly true of the hierarchical probabilistic modelling and Bayesian inference in XID+. This spawned a lot of activity in transferring data science into other domains. Key successes include:

ASTRODEM

The ASTRODEM project aims to create a predictive model which will help general practitioners (GPs) identify patients at high risk of dementia. University of Sussex astrophysicists including Oliver will “swap galaxies for general practice patient data” in an innovative new study, in collaboration with researchers from Brighton and Sussex Medical School. ASTRODEM is funded by a Wellcome Trust Seed Award in Science.
Find out more about the project through the WWW site https://www.bsms.ac.uk/research/primary-care-and-population-health/health-informatics/astrodem/index.aspx

AstroCast

AstroCast is a project funded by the UK’s STFC and led by Seb Oliver at the University of Sussex.
Livestock accounts for 37.5% of Kenya's land area, 12% of its GDP and 40% of its agricultural sector, but is susceptible to frequent droughts and periods of overgrazing. In this context, this project will assess the potential of new Earth observation datasets to deliver near real-time monitoring and prediction of useful and accessible biomass for pastoralism.

Drought and flood events are a major threat in sub-Saharan Africa (SSA) causing substantial losses of life, assets and livelihoods, and weakened national economic performance. Hazard early warning and disaster risk preparedness actions can be effective in reducing these losses (as much as 20 times more effective than post-disaster relief). In this project we will apply advanced data analysis techniques used in astronomy to facilitate improved hazard early warning models in Kenya.

This pilot project brings together STFC funded Astronomers at University of Sussex with a strong track record in data analysis with a world-leading interdisciplinary team in Climate Change and Developmental studies.

Several global to regional pasture monitoring systems exist that are based on Earth observation data, which are used in early warning systems. These systems tend to rely on coarse resolution data (250m - 8km) to provide near real-time information on vegetation health, and are combined with mechanistic models or expert knowledge to forecast seasonal outcomes. However, this spatial scale is unable to adequately distinguish pastures from scrublands, small farms and woody vegetation, and is unable to provide meaningful information on the onset of vegetation stresses. This project will use data from the Copernicus Sentinel mission (data provided every 5-12 days at 10-20m resolution) with key information on vegetation state.

The ultimate outcome of this research will support pastoralist communities in Kenya in deciding the suitability and location of pastureland for their various livestock through: a) improved understanding of the spatio-temporal distribution of pastures; b) improved understanding of ecological changes and resilience of pastures; and c) near-future predictions of pasture suitability. This will enhance their livelihood resilience in the wake of large and extensive droughts, overgrazing, and land cover change.

DISCUS

The Data Intensive Science Centre at the University of Sussex http://www.sussex.ac.uk/discus/.

Led by Seb Oliver, DISCUS is the Data Intensive Science Centre at the University of Sussex, a research unit built to address real social and economic challenges by applying data interpretation techniques developed by a cross-disciplinary team over a number of years.

DISCUS aims to support the UK’s public and private sector organisations as they seek to make better use of their largest and most complex data sets, delivering better outcomes for the general public, and staying competitive on the international stage.

DISCnet

DISCnet https://www.discnet.org.uk/ is a new doctoral training centre funded by STFC with Seb Oliver as director. DISCnet capitalises on our existing long-term collaborative and business engagement experiences through the South East Physics Network, SEPnet. DISCnet is an STFC Centre for Doctoral Training, providing a platform upon which we can train a new generation of post-graduate data intensive scientists – around 60 PhD students over two initial cohorts. Our students will be trained in the latest skills required for the rapidly growing data economy including: programming, big data handling, data analytics, and the latest statistical and machine learning techniques that underpin artificial intelligence. These skills will be honed on some of the most challenging big data science questions in particle physics and astrophysics. Each student will undertake two 3-month placements in non-University environments undertaking a data intensive project.

DataJavelin https://www.datajavelin.com/

Peter Hurley was one of the HELP project scientists and the architect of XID+. He has used these skills in various projects and has now co-founded a spin-out data consultancy firm, DataJavelin. DataJavelin creates cutting-edge data science and machine learning solutions to solve problems across a broad range of domains. Their team have an excellent record in applying machine learning techniques to cutting-edge academic and business problems.
List of Websites:
Public www site address: herschel.sussex.ac.uk
Primary contact is the Principal Investigator, Seb Oliver, University of Sussex, S.Oliver@Sussex.ac.uk http://www.sussex.ac.uk/profiles/91548