CORDIS - EU research results

Integrated Data Analysis Pipelines for Large-Scale Data Management, HPC, and Machine Learning

Deliverables

Initial prototype of benchmarking toolkit

Prototype of benchmark toolkit for runtime tracing and performance analysis.

Extended Compiler Prototype

Software artifact of an extended compiler prototype that added advanced optimizations and incorporated feedback from the other work packages.

Improved prototype of pipeline, task, and parameter server scheduling

Refined prototype of the pipeline and task/data scheduling mechanisms as well as early prototypes for multi-tenant resource sharing and parameter server update strategies.

Compiler Prototype

Software artifact of the initial compiler prototype.

Prototype and overview HW accelerator support and performance models

Prototype of the techniques described in D7.1 and an overview report on the devised performance models.

DSL runtime prototype

Initial prototype of the end-to-end runtime system.

Prototype of pipelines and task scheduling mechanisms

Initial prototype of the pipeline and task/data scheduling mechanisms.

Prototype and overview of managed storage tiers and near-data processing

Report and initial prototype of managed storage tiers and near-data processing for first and second order functions.

Prototype and overview code generation framework

Prototype and report describing the extended code generation framework for GPUs and other accelerators, with special focus on the generation of sparsity-exploiting fused operators.

Prototype and overview of data path optimizations and placement

Report and prototype of used data path optimization techniques and automatic data placement in hybrid memory and storage configurations.

2nd Annual Project Report

Public report describing the project progress until M24, achievements and impact, as well as a calculation of efforts and costs.

DSL runtime design

Report on the initial design of distribution primitives and existing framework integration.

Initial System Architecture

Report on requirements of end-to-end data analysis pipelines and design of the initial system architecture.

Compiler Design and Overview

Report on the overall design and high-level overview of the internals and key techniques.

Initial benchmark concept and definition

Report on the DaphneBench concept and detailed benchmark specification.

Scheduler design for pipelines and tasks

Report on the initial overall design of the scheduling components: scheduling of pipelines and workflows, as well as task and data placement.

3rd Annual Project Report

Public report describing the project progress until M36, achievements and impact, as well as a calculation of efforts and costs.

SotA survey of benchmarks from DM, HPC, and ML Sys

Report on the state of the art of benchmarks for database systems, data management, high-performance computing, and ML systems.

Initial pipeline definition all use cases

Report on use case studies with technical details and the definition of initial pipelines that can be used for testing.

Language Design Specification

Report on the language abstractions, APIs, and DSL, as well as the central internal representation.

Refined System Architecture

Report on the final, i.e., refined and improved system architecture.

Design of integration HW accelerators

Report on the planned overall design for integrating HW accelerators, with details on accelerated operations and primitives as well as their compiler and runtime support.

Report on search space analysis, automatic capability configuration

Report on state-of-the-art techniques for computational storage, near-data processing, and potential side effects, as well as an overview of automatically determining the capabilities of a storage configuration.

Improved DSL runtime prototype and overview

Report and prototype of the improved runtime system (primitives, framework integration, local/distributed operations, data access).

Improved pipelines all use case studies

Report on extended pipelines that improve runtime and/or accuracy by leveraging the DAPHNE system infrastructure and exploiting the eminent runtime-accuracy trade-off.

1st Annual Project Report

Public report describing the project progress until M12, achievements and impact, as well as a calculation of efforts and costs.

Publications

DaphneSched: A Scheduler for Integrated Data Analysis Pipelines

Authors: Ahmed Eleliemy, Florina M. Ciorba
Published in: ISPDC23; IEEE, 2023
Publisher: ISPDC23; IEEE

I/O Interface Independence with xNVMe

Authors: Simon Lund, Philippe Bonnet, Klaus Jensen, Javier Gonzalez
Published in: Proceedings of the 15th ACM International Systems and Storage Conference, Issue annually, 2022
Publisher: ACM - Association for Computing Machinery

DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines

Authors: Patrick Damme, Marius Birkenbach, Constantinos Bitsakos, Matthias Boehm, Philippe Bonnet, Florina Ciorba, Mark Dokter, Pawel Dowgiallo, Ahmed Eleliemy, Christian Faerber, Georgios Goumas, Dirk Habich, Niclas Hedam, Marlies Hofer, Wenjun Huang, Kevin Innerebner, Vasileios Karakostas, Roman Kern, Tomaž Kosar, Daniel Krems, Andreas Laber, Wolfgang Lehner, Eric Mier, Marcus Paradies, Bernhard Peischl
Published in: Conference on Innovative Data Systems Research, CIDR, Issue 9.1.2022-12.1.2022, 2022
Publisher: Conference on Innovative Data Systems Research, CIDR

Micro-architectural Analysis of a Learned Index

Authors: Mikkel Møller Andersen, Pınar Tözün
Published in: Proceedings of the International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, Issue annually, 2022
Publisher: ACM - Association for Computing Machinery
DOI: 10.1145/3533702.3534917

Not your Grandpa's SSD: The Era of Co-Designed Storage Devices

Authors: Alberto Lerner, Philippe Bonnet
Published in: Proceedings of the 2021 International Conference on Management of Data, 2021
Publisher: ACM

Evaluating Multi-GPU Sorting with Modern Interconnects

Authors: Tobias Maltenberger, Ivan Ilic, Ilin Tolovski, Tilmann Rabl
Published in: Proceedings of the 2022 International Conference on Management of Data (SIGMOD ’22), Issue annually, 2022
Publisher: ACM - Association for Computing Machinery

Efficient Multi-Model Management

Authors: Nils Strassenburg, Dominic Kupfer, Julia Kowal, Tilmann Rabl
Published in: 26th International Conference on Extending Database Technology (EDBT), Issue annually, 2023
Publisher: OpenProceedings.org

Enabling Integrated Data Analysis Pipelines on Heterogeneous Hardware through Holistic Extensibility

Authors: Patrick Damme, Matthias Boehm
Published in: 2nd Workshop on Novel Data Management Ideas on Heterogeneous Hardware Architectures (NoDMC), 2023
Publisher: Gesellschaft für Informatik

A Survey of Big Data, High Performance Computing, and Machine Learning Benchmarks

Authors: Nina Ihde, Paula Marten, Ahmed Eleliemy, Gabrielle Poerwawinata, Pedro Silva, Ilin Tolovski, Florina M. Ciorba, Tilmann Rabl
Published in: Proceedings of the Thirteenth TPC Technology Conference on Performance Evaluation & Benchmarking, 2021
Publisher: Springer

Delilah: eBPF-offload on computational storage

Authors: Niclas Hedam, Morten Tychsen Clausen, Philippe Bonnet, Sangjin Lee, Ken Friis Larsen
Published in: 19th International Workshop on Data Management on New Hardware (DaMoN), Issue annually, 2023
Publisher: ACM

Parallelization of benchmarking using HPC: text summarization in natural language processing (NLP), glider piloting in deep-sea missions, and search algorithms in computational intelligence (CI)

Authors: Aleš Zamuda
Published in: Proceedings of the Austrian-Slovenian HPC Meeting 2021 - ASHPC21, 2021, ISBN 978-961-6980-77-7
Publisher: University of Ljubljana

DeGNN: Improving Graph Neural Networks with Graph Decomposition

Authors: Miao, Xupeng; Gürel, Nezihe Merve (ORCID 0000-0002-4747-2406); Zhang, Wentao; Han, Zhichao; Li, Bo; Min, Wei; Rao, Susie (ORCID 0000-0003-2379-1506); Ren, Hansheng; Shan, Yinan; Shao, Yingxia; Wang, Yujie; Wu, Fan; Xue, Hui; Yang, Yaming; Zhang, Zitao; Zhao, Yang; Zhang, Shuai (ORCID 0000-0002-7866-4611); Wang, Yujing; Cui, Bin; Zhang, Ce
Published in: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD '21), Issue annually, 2021
Publisher: ACM

Accelerating Parallel Operation for Compacting Selected Elements on GPUs

Authors: Johannes Fett, Urs Kober, Christian Schwarz, Dirk Habich, Wolfgang Lehner
Published in: Euro-Par2022, Issue annually, 2022, ISBN 9798400707834
Publisher: 28th International European Conference on Parallel and Distributed Computing

Evaluating In-Memory Hash Joins on Persistent Memory

Authors: Tobias Maltenberger, Till Lehmann, Lawrence Benson, Tilmann Rabl
Published in: 25th International Conference on Extending Database Technology (EDBT), Issue annually, 2022
Publisher: OpenProceedings.org
DOI: 10.48786/edbt.2022.23

Evaluating SIMD Compiler-Intrinsics for Database Systems

Authors: Lawrence Benson, Richard Ebeling, Tilmann Rabl
Published in: VLDBW -- ADMS 23, 2023
Publisher: ACM - Association for Computing Machinery

TPCx-AI - An Industry Standard Benchmark for Artificial Intelligence and Machine Learning Systems

Authors: Christoph Brücke, Philipp Härtling, Rodrigo D Escobar Palacios, Hamesh Patel, Tilmann Rabl
Published in: 2023
Publisher: VLDB23; ACM - Association for Computing Machinery

Desis: Efficient Window Aggregation in Decentralized Networks

Authors: Wang Yue, Lawrence Benson, Tilmann Rabl
Published in: 26th International Conference on Extending Database Technology (EDBT), Issue annually, 2023
Publisher: OpenProceedings.org

Darwin: Scale-In Stream Processing

Authors: Lawrence Benson, Tilmann Rabl
Published in: Conference on Innovative Data Systems Research, CIDR 22, Issue annually, 2022
Publisher: Conference on Innovative Data Systems Research, CIDR 22

Analyzing Vectorized Hash Tables across CPU Architectures

Authors: Maximilian Böther, Lawrence Benson, Ana Klimovic, Tilmann Rabl
Published in: Proceedings of the VLDB Endowment, 16 (11), Issue 25, 2023
Publisher: ACM - Association for Computing Machinery
DOI: 10.14778/3611479.3611485

Considering a Fear and Greed Index in Bitcoin Price Prediction Through Long Short-Term Memory

Authors: Nataša Ošep Ferš, Aleš Zamuda
Published in: IEEE Slovenia Section, Issue annually, 2021
Publisher: IEEE

Maximizing Persistent Memory Bandwidth Utilization for OLAP Workloads

Authors: Björn Daase, Lars Jonas Bollmeier, Lawrence Benson, Tilmann Rabl
Published in: Proceedings of the 2021 International Conference on Management of Data (SIGMOD 2021), 2021
Publisher: ACM

Solving 100-Digit Challenge with Score 100 by Extended Running Time and Parallel Benchmarking

Authors: Aleš Zamuda
Published in: Proceedings of the 17th International Symposium on Operational Research in Slovenia, Issue bi-annually, 2023, ISBN 978-961-6165-61-7
Publisher: Slovenian Society INFORMATIKA – Section for Operational Research

DaphneSched: A Scheduler for Integrated Data Analysis Pipelines

Authors: Ahmed Eleliemy, Florina M. Ciorba, Jonas H. Müller Korndörfer
Published in: ISPDC23, 2023
Publisher: ISPDC23

How Do OS and Application Schedulers Interact? An Investigation with Multithreaded Applications

Authors: Jonas H. Müller Korndörfer, Ahmed Eleliemy, Osman Simsek, Thomas Ilsche, Robert Schöne, Florina M. Ciorba
Published in: Springer, 2023, ISBN 978-3-031-39697-7
Publisher: Euro-Par 2023

Near to Far: An Evaluation of Disaggregated Memory for In-Memory Data Processing

Authors: Andreas Geyer, Johannes Pietrzyk, Alexander Krause, Dirk Habich, Wolfgang Lehner, Christian Färber, Thomas Willhalm
Published in: DIMES@SOSP 2023: 1st workshop on Disruptive Memory Systems, 2023
Publisher: ACM Symposium on Operating Systems Principles (SOSP)

PerMA-Bench: Benchmarking Persistent Memory Access

Authors: Benson, Lawrence; Papke, Leon; Rabl, Tilmann
Published in: Proceedings of the Very Large Data Base (VLDB) Endowment, Issue annually, 2022
Publisher: ACM - Association for Computing Machinery
DOI: 10.14778/3551793.3551807

Predicting Ion Beam Tuning in Semiconductor Manufacturing

Authors: Andreas Laber, Martin Gebser, Konstantin Schekotihin, Yao Yang
Published in: IEEE Electron Devices Society, Issue bi-annually, 2022
Publisher: IEEE

BabelMR: A Polyglot Framework for Serverless MapReduce

Authors: Fabian Mahling, Paul Rößler, Thomas Bodner, Tilmann Rabl
Published in: VLDBW -- SDA 23, 2023
Publisher: ACM - Association for Computing Machinery

VolcanoML: Speeding up End-to-End AutoML via Scalable Search Space Decomposition

Authors: Li, Yang; Shen, Yu; Zhang, Wentao; Jiang, Jiawei; Ding, Bolin; Li, Yaliang; Zhou, Jingren; Yang, Zhi; Wu, Wentao; Zhang, Ce; Cui, Bin
Published in: Proceedings of the VLDB Endowment, 14 (11), Issue annually, 2021
Publisher: PVLDB

Ease. ML: A Lifecycle Management System for Machine Learning

Authors: Aguilar Melgar, Leonel (ORCID 0000-0001-6864-4492); Dao, David; Gan, Shaoduo; Gürel, Nezihe M.; Hollenstein, Nora (ORCID 0000-0001-7936-4170); Jiang, Jiawei; Karlaš, Bojan; Lemmin, Thomas (ORCID 0000-0001-5705-4964); Li, Tian; Li, Yang; Rao, Susie (ORCID 0000-0003-2379-1506); Rausch, Johannes; Renggli, Cedric; Rimanic, Luka; Weber, Maurice; Zhang, Shuai (ORCID 0000-0002-7866-4611); Zhao, Zh
Published in: Proceedings of the Annual Conference on Innovative Data Systems Research (CIDR), 2021, Issue 1, 2021
Publisher: CIDR 2021
DOI: 10.3929/ethz-b-000458916

Drop It In Like It’s Hot: An Analysis of Persistent Memory as a Drop-in Replacement for NVMe SSDs

Authors: Maximilian Böther, Otto Kißig, Lawrence Benson, Tilmann Rabl
Published in: International Workshop on Data Management on New Hardware (DAMON’21), 2021
Publisher: ACM SIGMOD/PODS

Efficiently Managing Deep Learning Models in a Distributed Environment

Authors: Nils Strassenburg, Ilin Tolovski, Tilmann Rabl
Published in: 25th International Conference on Extending Database Technology (EDBT), Issue annually, 2022
Publisher: OpenProceedings.org
DOI: 10.48786/edbt.2022.12

DaxVM: Stressing the Limits of Memory as a File Interface

Authors: Chloe Averti, Vasileios Karakostas, Nikhita Kunati, Georgios Goumas, Michael Swift
Published in: MICRO 2022 - 55th IEEE/ACM International Symposium on Microarchitecture, Issue annually, 2022
Publisher: ACM/IEEE

A Resourceful Coordination Approach for Multilevel Scheduling

Authors: Eleliemy, Ahmed; Ciorba, Florina M.
Published in: International Conference on High Performance Computing & Simulation (HPCS) 2021, Issue annual, 2021
Publisher: HPCS

TPCx-AI on NVIDIA Jetsons

Authors: Robert Bayer, Jon Voigt Tøttrup, Pınar Tözün
Published in: Proceedings of the Fourteenth TPC Technology Conference on Performance Evaluation & Benchmarking, 2022
Publisher: ACM - Association for Computing Machinery

RMG Sort: Radix-Partitioning-Based Multi-GPU Sorting

Authors: Ivan Ilic, Ilin Tolovski, Tilmann Rabl
Published in: Datenbanksysteme für Business, Technologie und Web (BTW 2023), Issue bi-annually, 2023
Publisher: Springer

Viper: An Efficient Hybrid PMem-DRAM Key-Value Store

Authors: Lawrence Benson, Hendrik Makait, Tilmann Rabl
Published in: 2021
Publisher: ACM

Analyzing Vectorized Hash Tables Across CPU Architectures

Authors: Maximilian Böther, Lawrence Benson, Ana Klimovic, Tilmann Rabl
Published in: VLDB23, 2023
Publisher: ACM - Association for Computing Machinery

DocParser: Hierarchical Document Structure Parsing from Renderings

Authors: Rausch, Johannes; Martinez, Octavio; Bissig, Fabian; Zhang, Ce; Feuerriegel, Stefan
Published in: Proceedings of the AAAI Conference on Artificial Intelligence, 35 (5), 2021, Page(s) 4328-4338, ISSN 2159-5399
Publisher: AAAI Press
DOI: 10.13039/501100000780

BAGUA: Scaling up Distributed Learning with System Relaxations

Authors: Shaoduo Gan, Xiangru Lian, Rui Wang, Jianbin Chang, Chengjun Liu, Hongmei Shi, Shengzhuo Zhang, Xianghong Li, Tengxu Sun, Jiawei Jiang, Binhang Yuan, Sen Yang,
Published in: PVLDB, 2021, ISSN 2150-8097
Publisher: PVLDB

The urban morphology on our planet – Global perspectives from space

Authors: Xiao Xiang Zhu, Chunping Qiu, Jingliang Hu, Yilei Shi, Yuanyuan Wang, Michael Schmitt, Hannes Taubenböck
Published in: Remote Sensing of Environment, Issue 16 volumes / year, 2021, ISSN 0034-4257
Publisher: Elsevier BV
DOI: 10.1016/j.rse.2021.112794

Micro-architectural analysis of in-memory OLTP: Revisited

Authors: Utku Sirin, Pınar Tözün, Danica Porobic, Ahmad Yasin, Anastasia Ailamaki
Published in: The VLDB Journal, Volume 30, Issue every other month, July 2021, 2021, ISSN 1066-8888
Publisher: Springer Verlag
DOI: 10.1007/s00778-021-00663-8

PerMA-bench

Authors: Benson, Lawrence; Papke, Leon; Rabl, Tilmann
Published in: Proceedings of the Very Large Data Base (VLDB) Endowment, Issue annually, 2022, ISSN 2150-8097
Publisher: ACM - Association for Computing Machinery

Better Database Cost/Performance via Batched I/O on Programmable SSD

Authors: Jaeyoung Do, Ivan Luiz Picoli, David Lomet, Philippe Bonnet
Published in: Conference on Very Large Data Bases (VLDB Journal), Issue 18.2.2021, 2021, ISSN 1066-8888
Publisher: Springer Verlag
DOI: 10.1007/s00778-020-00648-z

LB4OMP: A Dynamic Load Balancing Library for Multithreaded Applications

Authors: Jonas H. Müller Korndörfer, Ahmed Eleliemy, Ali Mohammed, Florina M. Ciorba
Published in: IEEE Transactions on Parallel and Distributed Systems, Volume 33, Issue 4, 2021, Page(s) 830 - 841, ISSN 1045-9219
Publisher: Institute of Electrical and Electronics Engineers
DOI: 10.1109/tpds.2021.3107775

Automated Scheduling Algorithm Selection and Chunk Parameter Calculation in OpenMP

Authors: Ali Mohammed, Jonas H. Müller Korndörfer, Ahmed Eleliemy, Florina M. Ciorba
Published in: IEEE Transactions on Parallel and Distributed Systems, Volume 33, Issue 12, December 2022, ISSN 1045-9219
Publisher: Institute of Electrical and Electronics Engineers

Speeding up Vectorized Benchmarking of Optimization Algorithms

Authors: Aleš Zamuda
Published in: Austrian-Slovenian HPC Meeting 2022 – ASHPC22, Issue annually, 2022
Publisher: EuroCC Austria

Don’t Compete, Let’s Cooperate: A Cooperative Scheduling Approach

Authors: Ahmed Eleliemy, Florina M. Ciorba
Published in: Platform for Advancing Scientific Computing Conference, 2021
Publisher: PASC

Simplicity done right for SIMDified query processing on CPU and FPGA

Authors: Johannes Fett, Urs Kober, Christian Schwarz, Dirk Habich, Wolfgang Lehner
Published in: ACM SIGMOD/PODS, 2023, ISBN 9798400707834
Publisher: ACM SIGMOD/PODS

CleanML: A Study for Evaluating the Impact of Data Cleaning on ML Classification Tasks

Authors: Li Peng, Rao Xi, Jennifer Blase, Xu Chu, Yue Zhang, Ce Zhang
Published in: DeGNN, 2020
Publisher: ETH Zurich, Institute for Computing Platforms
DOI: 10.13039/501100001711

Single- and Two-Level Dynamic Load Balancing of Scientific Applications

Authors: Ahmed Eleliemy, Florina M. Ciorba
Published in: Platform for Advancing Scientific Computing Conference, 2021
Publisher: PASC

Autonomy Loops for Monitoring, Operational Data Analytics, Feedback, and Response in HPC Operations

Authors: Francieli Boito, Jim Brandt, Valeria Cardellini, Philip Carns, Florina M. Ciorba, Hilary Egan, Ahmed Eleliemy, Ann Gentile, Thomas Gruber, Jeff Hanson, Utz-Uwe Haus, Kevin Huck, Thomas Ilsche, Thomas Jakobsche, Terry Jones, Sven Karlsson, Abdullah Mueen, Michael Ott, Tapasya Patki, Krishnan Raghavan, Stephen Simms, Kathleen Shoga, Michael Showerman, Devesh Tiwari, Torsten Wilde, Ivy Peng, and Keij
Published in: HPCMASPA workshop at IEEE Cluster 2023, Issue annually, 2023
Publisher: IEEE
