
Integrated Data Analysis Pipelines for Large-Scale Data Management, HPC, and Machine Learning

Objective

Modern data-driven applications leverage large, heterogeneous data collections to find interesting patterns and build robust machine learning (ML) models for accurate predictions. Large data sizes and advanced analytics spurred the development and adoption of data-parallel computation frameworks like Apache Spark or Flink as well as distributed ML systems like MLlib, TensorFlow, or PyTorch. A key observation is that these new systems share many techniques with traditional high-performance computing (HPC), and that the architecture of the underlying hardware clusters is converging. Yet the programming paradigms, cluster resource management, and data formats and representations differ substantially across data management, HPC, and ML software stacks. There is a trend, though, toward complex data analysis pipelines that combine these different systems. Examples are workflows of distributed data pre-processing, tuned HPC libraries, and dedicated ML systems, but also HPC applications that leverage ML models for more cost-effective simulation. Major obstacles are (1) limited development productivity for integrated analysis pipelines due to different programming models and separate cluster environments, (2) unnecessary data movement overhead and underutilization due to separate, statically provisioned clusters, and (3) the lack of a common system infrastructure with good interoperability. For these reasons, DAPHNE’s overall objective is the definition of an open and extensible systems infrastructure for integrated data analysis pipelines. We aim to build a reference implementation of language abstractions (i.e., APIs and a domain-specific language), an intermediate representation, and compilation and runtime techniques with support for integrating and scheduling heterogeneous accelerator and storage devices. A variety of real-world, high-impact use cases, datasets, and a new benchmark will be used for qualitative and quantitative analysis in comparison with the state of the art.

Field of science

  • /natural sciences/chemical sciences/analytical chemistry/quantitative analysis
  • /humanities/languages and literature/languages - general
  • /natural sciences/computer and information sciences/artificial intelligence/machine learning
  • /natural sciences/computer and information sciences/data science/data analysis

Call for proposal

H2020-ICT-2020-1

Funding Scheme

RIA - Research and Innovation action

Coordinator

KNOW-CENTER GMBH RESEARCH CENTER FOR DATA-DRIVEN BUSINESS & BIG DATA ANALYTICS
Address
Inffeldgasse 13/6
8010 Graz
Austria
Activity type
Research Organisations
EU contribution
€ 962 888,75

Participants (12)

AVL LIST GMBH
Austria
EU contribution
€ 419 175
Address
Hans-list-platz 1
8020 Graz
Activity type
Private for-profit entities (excluding Higher or Secondary Education Establishments)

DEUTSCHES ZENTRUM FUR LUFT - UND RAUMFAHRT EV
Germany
EU contribution
€ 849 830
Address
Linder Hohe
51147 Koln
Activity type
Research Organisations

EIDGENOESSISCHE TECHNISCHE HOCHSCHULE ZUERICH
Switzerland
EU contribution
€ 448 032,50
Address
Raemistrasse 101
8092 Zuerich
Activity type
Higher or Secondary Education Establishments

HASSO-PLATTNER-INSTITUT FUR DIGITAL ENGINEERING GGMBH
Germany
EU contribution
€ 458 750
Address
Prof Dr Helmert Strasse 2-3
14482 Potsdam
Activity type
Research Organisations

INSTITUTE OF COMMUNICATION AND COMPUTER SYSTEMS
Greece
EU contribution
€ 415 000
Address
Patission Str. 42
10682 Athina
Activity type
Research Organisations

INFINEON TECHNOLOGIES AUSTRIA AG
Austria
EU contribution
€ 436 015
Address
Siemensstrasse 2
9500 Villach
Activity type
Private for-profit entities (excluding Higher or Secondary Education Establishments)

INTEL TECHNOLOGY POLAND SPOLKA Z OGRANICZONA ODPOWIEDZIALNOSCIA
Poland
EU contribution
€ 271 375
Address
Ul Slowackiego 173
80 298 Gdansk
Activity type
Private for-profit entities (excluding Higher or Secondary Education Establishments)

IT-UNIVERSITETET I KOBENHAVN
Denmark
EU contribution
€ 523 100
Address
Rued Langgaardsvej 7
2300 Kobenhavn
Activity type
Higher or Secondary Education Establishments

KAI KOMPETENZZENTRUM AUTOMOBIL - UND INDUSTRIEELEKTRONIK GMBH
Austria
EU contribution
€ 470 875
Address
Europastrasse 8
9524 Villach St Magdalen
Activity type
Private for-profit entities (excluding Higher or Secondary Education Establishments)

TECHNISCHE UNIVERSITAET DRESDEN
Germany
EU contribution
€ 409 500
Address
Helmholtzstrasse 10
01069 Dresden
Activity type
Higher or Secondary Education Establishments

UNIVERZA V MARIBORU
Slovenia
EU contribution
€ 244 975
Address
Slomskov Trg 15
2000 Maribor
Activity type
Higher or Secondary Education Establishments

UNIVERSITAT BASEL
Switzerland
EU contribution
€ 700 148,75
Address
Petersplatz 1
4051 Basel
Activity type
Higher or Secondary Education Establishments