CORDIS
EU research results

REliable power and time-ConstraInts-aware Predictive management of heterogeneous Exascale systems

Objective

Current HPC facilities will need to grow by an order of magnitude in the next few years to reach the Exascale range. The dedicated middleware needed to manage the enormous complexity of future HPC centers, where deep heterogeneity is needed to handle the wide variety of applications within reasonable power budgets, will be one of the most critical aspects in the evolution of HPC infrastructure towards Exascale. This middleware will need to address the critical issue of reliability in the face of the increasing number of resources, and the resulting decrease in mean time between failures.
To close this gap, RECIPE provides: a hierarchical runtime resource management infrastructure optimizing energy efficiency and ensuring reliability for both time-critical and throughput-oriented computation; a predictive reliability methodology to support the enforcement of QoS guarantees in the face of both transient and long-term hardware failures, including thermal, timing and reliability models; and a set of integration layers allowing the resource manager to interact with both the application and the underlying deeply heterogeneous architecture, addressing them in a disaggregated way.
Quantitative goals for RECIPE include: a 25% increase in energy efficiency (performance/watt) with a 15% MTTF improvement due to proactive thermal management; an energy-delay product improved by up to 25%; and a 20% reduction of faulty executions.
The project will assess its results against a set of real-world use cases, addressing key application domains ranging from well-established HPC applications such as geophysical exploration and meteorology, to emerging application domains such as biomedical machine learning and data analytics.
To this end, RECIPE relies on a consortium composed of four leading academic partners (POLIMI, UPV, EPFL, CeRICT); two supercomputing centers, BSC and PSNC; a research hospital, CHUV; and an SME, IBTS, which provide effective exploitation avenues through industry-based use cases.

Coordinator

POLITECNICO DI MILANO

Address

Piazza Leonardo Da Vinci 32
20133 Milano

Italy

Activity type

Higher or Secondary Education Establishments

EU Contribution

€ 705 000

Participants (7)

UNIVERSITAT POLITECNICA DE VALENCIA

Spain

EU Contribution

€ 437 000

Centro Regionale Information Communication Technology scrl

Italy

EU Contribution

€ 395 500

BARCELONA SUPERCOMPUTING CENTER - CENTRO NACIONAL DE SUPERCOMPUTACION

Spain

EU Contribution

€ 410 500

INSTYTUT CHEMII BIOORGANICZNEJ POLSKIEJ AKADEMII NAUK

Poland

EU Contribution

€ 397 250

ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE

Switzerland

EU Contribution

€ 465 250

INTELLIGENCE BEHIND THINGS SOLUTIONS SRL

Italy

EU Contribution

€ 290 500

CENTRE HOSPITALIER UNIVERSITAIRE VAUDOIS

Switzerland

EU Contribution

€ 184 300

Project information

Grant agreement ID: 801137

Status

Ongoing project

  • Start date

    1 May 2018

  • End date

    30 April 2021

Funded under:

H2020-EU.1.2.2.

  • Overall budget:

    € 3 290 800

  • EU contribution

    € 3 285 300

Coordinated by:

POLITECNICO DI MILANO

Italy