
Enabling dynamic and Intelligent workflows in the future EuroHPC ecosystem

Periodic Reporting for period 1 - eFlows4HPC (Enabling dynamic and Intelligent workflows in the future EuroHPC ecosystem)

Reporting period: 2021-01-01 to 2022-06-30

The goals of a large number of complex scientific and industrial applications are deeply linked to the effective use of high-performance computing infrastructures and the efficient extraction of knowledge from vast amounts of data. Such applications ingest large amounts of data from different sources and follow a process composed of preprocessing steps for intermediate data curation and preparation, subsequent computing steps, and later analysis and analytics steps that produce the results. However, workflows are currently fragmented into multiple components that use different programming models and separate processes for computing and data management.
While a large number of current application workflows follow a brute-force approach, in which the design or research space is blindly explored, new intelligent approaches that make the best use of the computational resources are urgently needed.
eFlows4HPC aims to deliver a workflow software stack and an additional set of services to enable the integration of HPC simulation and modelling with big data analytics and machine learning in scientific and industrial applications. To widen access to HPC for newcomers, the project will provide HPC Workflows as a Service (HPCWaaS), an environment for sharing, reusing, deploying and executing existing workflows on HPC systems. Specific optimization tasks for the use of accelerators (FPGAs, GPUs) and the European Processor Initiative (EPI) will be performed in the project use cases.
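As an illustration of what HPCWaaS means for an end user, the following sketch shows how a previously published workflow could be triggered through a service API. The endpoint, payload fields and credential handling are hypothetical placeholders, not the project's actual interface.

```python
# Minimal sketch of triggering a registered workflow through an HPCWaaS-style
# execution API. The URL, payload fields and token handling are hypothetical.
import requests

HPCWAAS_URL = "https://hpcwaas.example.org/api/v1"   # hypothetical service URL
ACCESS_TOKEN = "..."                                  # obtained out of band

def run_workflow(workflow_id: str, inputs: dict) -> str:
    """Submit one execution of a published workflow and return its run id."""
    response = requests.post(
        f"{HPCWAAS_URL}/workflows/{workflow_id}/executions",
        json={"inputs": inputs},
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["execution_id"]

# A newcomer only needs the workflow identifier and its inputs; the service
# takes care of deployment, scheduling and data movement on the HPC system.
run_id = run_workflow("example-workflow", {"mesh": "coarse", "steps": 100})
print("submitted execution:", run_id)
```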
The first phase of the project focused on the definition of the requirements, both from the pillar applications and from the different components of the software stack. At the same time and taking into account these requirements, a first version of the software architecture was designed.
Partners have also performed activities towards the definition of abstractions to support the integration of the different stack components. A first version of these abstractions with regard to software integration has been implemented. Another important step to drive software integration has been the design and development of a minimal workflow that implements a simple case but covers most of the required functionality in the development, deployment and execution phases of the workflow lifecycle. In addition, the project partners have designed and developed a first version of the HPCWaaS methodology.
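As a purely illustrative sketch (not the project's actual code), a minimal workflow of this kind can be pictured as a short chain of preprocessing, simulation and analytics steps; all function names and data below are made up for the example.

```python
# Illustrative minimal workflow: one simple case that still exercises the main
# lifecycle stages (data curation, a compute-intensive stage, final analytics).
import numpy as np

def preprocess(raw: np.ndarray) -> np.ndarray:
    """Data curation: drop invalid rows and normalise features."""
    clean = raw[~np.isnan(raw).any(axis=1)]
    return (clean - clean.mean(axis=0)) / clean.std(axis=0)

def simulate(inputs: np.ndarray, steps: int = 10) -> np.ndarray:
    """Stand-in for an HPC simulation kernel (here a toy iteration)."""
    state = inputs.copy()
    for _ in range(steps):
        state = state + 0.1 * np.tanh(state)
    return state

def analyse(results: np.ndarray) -> dict:
    """Analytics stage: reduce the simulation output to summary statistics."""
    return {"mean": float(results.mean()), "std": float(results.std())}

raw_data = np.random.default_rng(0).normal(size=(100, 4))
print(analyse(simulate(preprocess(raw_data))))
```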
Other activities have been devoted to optimizing different aspects of the workflow software stack, such as a machine-learning-based recommendation tool to find the optimal block size when splitting data sets for parallelization, or new strategies for gathering/scattering data from distributed to centralized data storage. The project has also designed and implemented a data catalogue service, which is up and running with links to some project data sets.
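A hedged sketch of the block-size recommendation idea follows, assuming a simple supervised model trained on measurements from past runs; the features, training data and model choice are illustrative assumptions, not the tool's actual design.

```python
# Sketch of a machine-learning-based block-size recommender: learn from past
# runs which block size worked best for a given data-set size and worker count.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical history of runs: (dataset_rows, n_workers) -> best block size found
X = np.array([[1e6, 4], [1e7, 4], [1e7, 16], [1e8, 16], [1e8, 64]])
y = np.array([50_000, 250_000, 120_000, 500_000, 300_000])

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

def recommend_block_size(dataset_rows: int, n_workers: int) -> int:
    """Predict a good block size for splitting the data set across workers."""
    return int(model.predict([[dataset_rows, n_workers]])[0])

print(recommend_block_size(50_000_000, 32))
```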
The project has also been identifying computational and artificial intelligence kernels that could become performance bottlenecks in the pillar applications. A selected number of these kernels are being optimized for GPUs, FPGAs and the EPI.
The partners involved in the pillar applications have been defining and implementing the first version of their workflows.
Pillar I, which aims at developing digital twins for manufacturing, has developed a first “complete” version of its workflow implemented on the basis of the eFlows4HPC software stack, allowing the end-to-end execution of a full model reduction of the cooling system of a SIEMENS electrical engine for simplified test cases.
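As a generic illustration of what building such a reduced model involves (not necessarily the method used in Pillar I), the following sketch compresses a set of full-order simulation snapshots into a small basis via proper orthogonal decomposition.

```python
# Generic model-reduction sketch via proper orthogonal decomposition (POD):
# compress full-order snapshots into a small basis, then project/reconstruct.
import numpy as np

rng = np.random.default_rng(1)
# Synthetic low-rank snapshots standing in for, e.g., temperature fields.
modes = rng.normal(size=(10_000, 5))
coeffs = rng.normal(size=(5, 50))
snapshots = modes @ coeffs + 0.01 * rng.normal(size=(10_000, 50))

# Truncated SVD of the snapshot matrix gives the reduced basis.
U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
energy = np.cumsum(s**2) / np.sum(s**2)
k = int(np.searchsorted(energy, 0.999)) + 1   # keep 99.9% of the energy
basis = U[:, :k]

# Any full-order state can now be represented by k coordinates.
reduced = basis.T @ snapshots[:, 0]
reconstructed = basis @ reduced
print(k, np.linalg.norm(snapshots[:, 0] - reconstructed))
```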
Pillar II has been developing two workflows: the Dynamic (AI-assisted) Earth System Model (ESM) workflow, which aims at pruning ensemble runs based on runtime analytics, and the Statistical analysis and feature extraction workflow, which aims at the prediction of tropical cyclones. Both workflows have been designed and are partially developed, and a first version will be available at month 20 as planned.
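The ensemble-pruning idea can be sketched as follows, assuming a checkpoint at which each running member is scored on an intermediate diagnostic; the scoring function and the fraction kept are illustrative assumptions, not the workflow's actual criteria.

```python
# Sketch of AI-assisted ensemble pruning: at a runtime checkpoint, score each
# still-running member and keep only the most promising ones.
import numpy as np

def checkpoint_score(member_output: np.ndarray, reference: np.ndarray) -> float:
    """Lower is better: distance of the member's partial trajectory to a reference."""
    return float(np.mean((member_output - reference) ** 2))

def prune(members: dict, reference: np.ndarray, keep_fraction: float = 0.5) -> list:
    """Return ids of the members worth continuing after this checkpoint."""
    scored = sorted(members, key=lambda m: checkpoint_score(members[m], reference))
    keep = max(1, int(len(scored) * keep_fraction))
    return scored[:keep]

rng = np.random.default_rng(2)
ensemble = {f"member_{i:02d}": rng.normal(size=120) for i in range(8)}
print(prune(ensemble, reference=np.zeros(120)))
```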
Pillar III aims at developing a workflow for earthquakes (UCIS4EQ) and another one for the subsequent tsunamis (FTRT/PTF). Both workflows have a similar structure: after an event they generate a set of ensemble simulations with the goal of predicting the impact of the natural hazard, followed by further phases that post-process and analyze the data. Both workflows have been designed and partially developed, and a first version will be available at month 20 as planned.
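The shared structure of the two workflows can be sketched generically: sample an ensemble of plausible sources for the event, run one simulation per member, and post-process the ensemble into a probabilistic impact estimate. The source model and the "simulation" below are placeholders, not the UCIS4EQ or FTRT/PTF codes.

```python
# Generic urgent-computing sketch: ensemble generation, parallel simulation,
# and probabilistic post-processing of the results.
from concurrent.futures import ProcessPoolExecutor
import numpy as np

def sample_sources(magnitude: float, n: int, seed: int = 0) -> np.ndarray:
    """Draw plausible magnitudes around the first estimate of the event."""
    return np.random.default_rng(seed).normal(magnitude, 0.2, size=n)

def simulate_impact(source_magnitude: float) -> float:
    """Placeholder for one hazard simulation; returns a peak intensity measure."""
    return 10 ** (source_magnitude - 6.0)

def probabilistic_impact(magnitude: float, n: int = 64) -> dict:
    sources = sample_sources(magnitude, n)
    with ProcessPoolExecutor() as pool:           # one simulation per member
        impacts = list(pool.map(simulate_impact, sources))
    return {"p50": float(np.percentile(impacts, 50)),
            "p95": float(np.percentile(impacts, 95))}

if __name__ == "__main__":
    print(probabilistic_impact(7.1))
```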
The project has delivered a first version of the project software (software stack and HPCWaaS methodology), available as open source in public repositories together with online documentation.
The consortium has also carried out a set of internal training sessions on the software stack components and on the HPCWaaS. Overall, the project has achieved good visibility through the publication of articles, conference keynotes and invited presentations, and presence in the media.
The current methodologies available to develop scientific workflows:
a) Do not fulfill the requirements of increasingly complex applications, which need novel methodologies that support HPC simulation or modelling, data analytics and machine learning in a single, holistic workflow.
b) Are not able to take full advantage of the underlying infrastructure, which is distributed in nature. The infrastructure we consider is composed of a large number of nodes with new types of multi-core processors (including accelerators such as GPUs), new storage devices that have the potential to change the way applications store data, and connections to external instruments, edge devices and cloud storage as sources of data.
c) Do not include techniques for dynamically adapting the execution of workflows on computing platforms.
d) Do not provide means to easily deploy, execute and reuse workflows in HPC systems.
e) Do not provide a fully integrated approach to managing both HPC and Big Data analytics requirements, i.e. data-oriented frameworks that can easily be extended with additional functionality for both HPC tasks and data analytics techniques.
f) Neglect, or do not address to the full extent, the challenge of making the required data available for processing on time, in the expected format and quality.
eFlows4HPC will progress beyond the state of the art by proposing ground-breaking solutions for interfaces for the development of scientific workflows that integrate HPC applications, machine learning and big data; novel intelligent runtimes that optimize the execution of the workflows, reducing the exploration space and the energy required for execution; and innovative tools for workflow modelling, always leveraging European solutions. It will define and implement an open methodology, HPC Workflows as a Service (HPCWaaS), that widens and eases the use of workflows by existing and new HPC communities and users. It will provide means to increase the openness, transparency, re-usability and reproducibility of computation results by providing catalogues, repositories and registries that store data sets and software components, including whole workflow instances. It will also provide a Data Logistic Service to fuel the data-intensive workflows with relevant, up-to-date data in a reproducible and transparent manner, and a Data Catalogue to list potentially relevant data sources.
eFlows4HPC overall approach
