Data-centric Parallel Programming

Publications

A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning

Author(s): Tal Ben-Nun, Maciej Besta, Simon Huber, Alexandros Nikolaos Ziogas, Daniel Peter, Torsten Hoefler
Published in: 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2019, Page(s) 66-77, ISBN 978-1-7281-1246-6
Publisher: IEEE
DOI: 10.1109/ipdps.2019.00018

Mitigating network noise on Dragonfly networks through application-aware routing

Author(s): Daniele De Sensi, Salvatore Di Girolamo, Torsten Hoefler
Published in: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2019, Page(s) 1-32, ISBN 9781450362290
Publisher: ACM
DOI: 10.1145/3295500.3356196

Productivity, portability, performance - data-centric Python

Author(s): Alexandros Nikolaos Ziogas, Timo Schneider, Tal Ben-Nun, Alexandru Calotoiu, Tiziano De Matteis, Johannes de Fine Licht, Luca Lavarini, Torsten Hoefler
Published in: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2021, Page(s) 1-13, ISBN 9781450384421
Publisher: ACM
DOI: 10.1145/3458817.3476176

Taming unbalanced training workloads in deep learning with partial collective operations

Author(s): Shigang Li, Tal Ben-Nun, Salvatore Di Girolamo, Dan Alistarh, Torsten Hoefler
Published in: Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020, Page(s) 45-61, ISBN 9781450368186
Publisher: ACM
DOI: 10.1145/3332466.3374528

Predicting Weather Uncertainty with Deep Convnets

Author(s): Grönquist, Peter; Ben-Nun, Tal; Dryden, Nikoli; Dueben, Peter; Lavarini, Luca; Li, Shigang; Hoefler, Torsten
Published in: 1, 2019
Publisher: arXiv

Naos: Serialization-free RDMA networking in Java

Author(s): Taranov, Konstantin; Bruno, Rodrigo; Alonso, Gustavo; Hoefler, Torsten
Published in: 2021
Publisher: USENIX Association

ProGraML: A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations

Author(s): Chris Cummins, Zacharias V. Fisches, Tal Ben-Nun, Torsten Hoefler, Michael F P O’Boyle, Hugh Leather
Published in: 2021
Publisher: International Conference on Machine Learning

On the parallel I/O optimality of linear algebra kernels - near-optimal LU factorization

Author(s): Grzegorz Kwasniewski, Tal Ben-Nun, Alexandros Nikolaos Ziogas, Timo Schneider, Maciej Besta, Torsten Hoefler
Published in: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021, Page(s) 463-464, ISBN 9781450382946
Publisher: ACM
DOI: 10.1145/3437801.3441590

SparCML - high-performance sparse communication for machine learning

Author(s): Cedric Renggli, Saleh Ashkboos, Mehdi Aghagolzadeh, Dan Alistarh, Torsten Hoefler
Published in: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2019, Page(s) 1-15, ISBN 9781450362290
Publisher: ACM
DOI: 10.1145/3295500.3356222

Streaming message interface - high-performance distributed memory programming on reconfigurable hardware

Author(s): Tiziano De Matteis, Johannes de Fine Licht, Jakub Beránek, Torsten Hoefler
Published in: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2019, Page(s) 1-33, ISBN 9781450362290
Publisher: ACM
DOI: 10.1145/3295500.3356201

SeBS - a serverless benchmark suite for function-as-a-service computing

Author(s): Marcin Copik, Grzegorz Kwasniewski, Maciej Besta, Michal Podstawski, Torsten Hoefler
Published in: Proceedings of the 22nd International Middleware Conference, 2021, Page(s) 64-78, ISBN 9781450385343
Publisher: ACM
DOI: 10.1145/3464298.3476133

An In-Depth Analysis of the Slingshot Interconnect

Author(s): De Sensi, Daniele; Di Girolamo, Salvatore; McMahon, Kim H.; Roweth, Duncan; Hoefler, Torsten
Published in: 1, 2020
Publisher: IEEE

Chimera: efficiently training large-scale neural networks with bidirectional pipelines

Author(s): Shigang Li, Torsten Hoefler
Published in: 2021
Publisher: ACM

Stateful dataflow multigraphs - a data-centric model for performance portability on heterogeneous architectures

Author(s): Tal Ben-Nun, Johannes de Fine Licht, Alexandros N. Ziogas, Timo Schneider, Torsten Hoefler
Published in: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2019, Page(s) 1-14, ISBN 9781450362290
Publisher: ACM
DOI: 10.1145/3295500.3356173

NPBench - a benchmarking suite for high-performance NumPy

Author(s): Alexandros Nikolaos Ziogas, Tal Ben-Nun, Timo Schneider, Torsten Hoefler
Published in: Proceedings of the ACM International Conference on Supercomputing, 2021, Page(s) 63-74, ISBN 9781450383356
Publisher: ACM
DOI: 10.1145/3447818.3460360

CoRM - Compactable Remote Memory over RDMA

Author(s): Konstantin Taranov, Salvatore Di Girolamo, Torsten Hoefler
Published in: Proceedings of the 2021 International Conference on Management of Data, 2021, Page(s) 1811-1824, ISBN 9781450383431
Publisher: ACM
DOI: 10.1145/3448016.3452817

Absinthe: Learning an Analytical Performance Model to Fuse and Tile Stencil Codes in One Shot

Author(s): Tobias Gysi, Tobias Grosser, Torsten Hoefler
Published in: 2019 28th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2019, Page(s) 370-382, ISBN 978-1-7281-3613-4
Publisher: IEEE
DOI: 10.1109/pact.2019.00036

Log(graph) - a near-optimal high-performance graph representation

Author(s): Maciej Besta, Dimitri Stanojevic, Tijana Zivic, Jagpreet Singh, Maurice Hoerold, Torsten Hoefler
Published in: Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018, Page(s) 1-13, ISBN 9781450359863
Publisher: ACM
DOI: 10.1145/3243176.3243198

Red-blue pebbling revisited - near optimal parallel matrix-matrix multiplication

Author(s): Grzegorz Kwasniewski, Marko Kabić, Maciej Besta, Joost VandeVondele, Raffaele Solcà, Torsten Hoefler
Published in: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2019, Page(s) 1-22, ISBN 9781450362290
Publisher: ACM
DOI: 10.1145/3295500.3356181

Pebbles, Graphs, and a Pinch of Combinatorics - Towards Tight I/O Lower Bounds for Statically Analyzable Programs

Author(s): Grzegorz Kwasniewski, Tal Ben-Nun, Lukas Gianinazzi, Alexandru Calotoiu, Timo Schneider, Alexandros Nikolaos Ziogas, Maciej Besta, Torsten Hoefler
Published in: Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures, 2021, Page(s) 328-339, ISBN 9781450380706
Publisher: ACM
DOI: 10.1145/3409964.3461796

Designing scalable FPGA architectures using high-level synthesis

Author(s): J. de Fine Licht, M. Blott, T. Hoefler
Published in: 2018
Publisher: ACM

Neural Code Comprehension: A Learnable Representation of Code Semantics

Author(s): Tal Ben-Nun, Alice Shoshana Jakobovits, Torsten Hoefler
Published in: Advances in Neural Information Processing Systems 31, 2018
Publisher: Curran Associates

A data-centric approach to extreme-scale ab initio dissipative quantum transport simulations

Author(s): Alexandros Nikolaos Ziogas, Tal Ben-Nun, Guillermo Indalecio Fernández, Timo Schneider, Mathieu Luisier, Torsten Hoefler
Published in: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis - SC '19, 2019, Page(s) 1-13, ISBN 9781450362290
Publisher: ACM Press
DOI: 10.1145/3295500.3357156

Optimizing the data movement in quantum transport simulations via data-centric parallel programming

Author(s): Alexandros Nikolaos Ziogas, Tal Ben-Nun, Guillermo Indalecio Fernández, Timo Schneider, Mathieu Luisier, Torsten Hoefler
Published in: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis - SC '19, 2019, Page(s) 1-17, ISBN 9781450362290
Publisher: ACM Press
DOI: 10.1145/3295500.3356200

A fast analytical model of fully associative caches

Author(s): Tobias Gysi, Tobias Grosser, Laurin Brandner, Torsten Hoefler
Published in: Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation - PLDI 2019, 2019, Page(s) 816-829, ISBN 9781450367127
Publisher: ACM Press
DOI: 10.1145/3314221.3314606

Embedding Functions Into Reversible Circuits - A Probabilistic Approach to the Number of Lines

Author(s): Niels Gleinig, Frances Ann Hubis, Torsten Hoefler
Published in: Proceedings of the 56th Annual Design Automation Conference 2019 - DAC '19, 2019, Page(s) 1-6, ISBN 9781450367257
Publisher: ACM Press
DOI: 10.1145/3316781.3317814

Substream-Centric Maximum Matchings on FPGA

Author(s): Maciej Besta, Marc Fischer, Tal Ben-Nun, Johannes De Fine Licht, Torsten Hoefler
Published in: Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays - FPGA '19, 2019, Page(s) 152-161, ISBN 9781450361378
Publisher: ACM Press
DOI: 10.1145/3289602.3293916

sPIN - High-performance streaming Processing In the Network

Author(s): Torsten Hoefler, Salvatore Di Girolamo, Konstantin Taranov, Ryan E. Grant, Ron Brightwell
Published in: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2017, Page(s) 1-16, ISBN 9781450351140
Publisher: ACM
DOI: 10.1145/3126908.3126970

Slim graph - practical lossy graph compression for approximate graph processing, storage, and analytics

Author(s): Maciej Besta, Simon Weber, Lukas Gianinazzi, Robert Gerstenberger, Andrey Ivanov, Yishai Oltchik, Torsten Hoefler
Published in: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2019, Page(s) 1-25, ISBN 9781450362290
Publisher: ACM
DOI: 10.1145/3295500.3356182

Corrected trees for reliable group communication

Author(s): Martin Küttler, Maksym Planeta, Jan Bierbaum, Carsten Weinhold, Hermann Härtig, Amnon Barak, Torsten Hoefler
Published in: Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming, 2019, Page(s) 287-299, ISBN 9781450362252
Publisher: ACM
DOI: 10.1145/3293883.3295721

Network-accelerated non-contiguous memory transfers

Author(s): Salvatore Di Girolamo, Konstantin Taranov, Andreas Kurth, Michael Schaffner, Timo Schneider, Jakub Beránek, Maciej Besta, Luca Benini, Duncan Roweth, Torsten Hoefler
Published in: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2019, Page(s) 1-14, ISBN 9781450362290
Publisher: ACM
DOI: 10.1145/3295500.3356189

StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems

Author(s): J. de Fine Licht, A. Kuster, T. De Matteis, T. Ben-Nun, D. Hofer, T. Hoefler
Published in: 2021
Publisher: tbc

Flexible Communication Avoiding Matrix Multiplication on FPGA with High-Level Synthesis

Author(s): Johannes de Fine Licht, Grzegorz Kwasniewski, Torsten Hoefler
Published in: Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020, Page(s) 244-254, ISBN 9781450370998
Publisher: ACM
DOI: 10.1145/3373087.3375296

An Efficient Algorithm for Sparse Quantum State Preparation

Author(s): Niels Gleinig, Torsten Hoefler
Published in: 2021 58th ACM/IEEE Design Automation Conference (DAC), 2021, Page(s) 433-438, ISBN 978-1-6654-3274-0
Publisher: IEEE
DOI: 10.1109/dac18074.2021.9586240

SlimSell: A Vectorizable Graph Representation for Breadth-First Search

Author(s): Maciej Besta, Florian Marending, Edgar Solomonik, Torsten Hoefler
Published in: 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2017, Page(s) 32-41, ISBN 978-1-5386-3914-6
Publisher: IEEE
DOI: 10.1109/ipdps.2017.93

Transformations of High-Level Synthesis Codes for High-Performance Computing

Author(s): Johannes de Fine Licht, Maciej Besta, Simon Meierhans, Torsten Hoefler
Published in: IEEE Transactions on Parallel and Distributed Systems, 32/5, 2021, Page(s) 1014-1029, ISSN 1045-9219
Publisher: Institute of Electrical and Electronics Engineers
DOI: 10.1109/tpds.2020.3039409

Breaking (Global) Barriers in Parallel Stochastic Optimization with Wait-Avoiding Group Averaging

Author(s): Shigang Li, Tal Ben-Nun, Giorgi Nadiradze, Salvatore Di Girolamo, Nikoli Dryden, Dan Alistarh, Torsten Hoefler
Published in: IEEE Transactions on Parallel and Distributed Systems, 2020, Page(s) 1-1, ISSN 1045-9219
Publisher: Institute of Electrical and Electronics Engineers
DOI: 10.1109/tpds.2020.3040606

Deep learning for post-processing ensemble weather forecasts

Author(s): Peter Grönquist, Chengyuan Yao, Tal Ben-Nun, Nikoli Dryden, Peter Dueben, Shigang Li, Torsten Hoefler
Published in: Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 379/2194, 2021, Page(s) 20200092, ISSN 1364-503X
Publisher: Royal Society of London
DOI: 10.1098/rsta.2020.0092

Trends in Data Locality Abstractions for HPC Systems

Author(s): Didem Unat, Anshu Dubey, Torsten Hoefler, John Shalf, Mark Abraham, Mauro Bianco, Bradford L. Chamberlain, Romain Cledat, H. Carter Edwards, Hal Finkel, Karl Fuerlinger, Frank Hannig, Emmanuel Jeannot, Amir Kamil, Jeff Keasler, Paul H J Kelly, Vitus Leung, Hatem Ltaief, Naoya Maruyama, Chris J. Newburn, Miquel Pericas
Published in: IEEE Transactions on Parallel and Distributed Systems, 28/10, 2017, Page(s) 3007-3020, ISSN 1045-9219
Publisher: Institute of Electrical and Electronics Engineers
DOI: 10.1109/tpds.2017.2703149

Substream-Centric Maximum Matchings on FPGA

Author(s): Maciej Besta, Marc Fischer, Tal Ben-Nun, Dimitri Stanojevic, Johannes De Fine Licht, Torsten Hoefler
Published in: ACM Transactions on Reconfigurable Technology and Systems, 13/2, 2020, Page(s) 1-33, ISSN 1936-7406
Publisher: Association for Computing Machinery (ACM)
DOI: 10.1145/3377871

Demystifying Parallel and Distributed Deep Learning

Author(s): Tal Ben-Nun, Torsten Hoefler
Published in: ACM Computing Surveys, 52/4, 2019, Page(s) 1-43, ISSN 0360-0300
Publisher: Association for Computing Machinery, Inc.
DOI: 10.1145/3320060

Stateful Dataflow Multigraphs: A Data-Centric Model for Performance Portability on Heterogeneous Architectures

Author(s): Ben-Nun, Tal; Licht, Johannes de Fine; Ziogas, Alexandros Nikolaos; Schneider, Timo; Hoefler, Torsten
Published in: arXiv, 4, 2019
Publisher: arXiv

Predicting Weather Uncertainty with Deep Convnets

Author(s): P. Grönquist, T. Ben-Nun, N. Dryden, P. Dueben, L. Lavarini, S. Li, T. Hoefler
Published in: arXiv, 2019
Publisher: arXiv

Streaming Message Interface: High-Performance Distributed Memory Programming on Reconfigurable Hardware

Author(s): T. De Matteis, J. de Fine Licht, J. Beránek, T. Hoefler
Published in: arXiv, 2019
Publisher: arXiv

FBLAS: Streaming Linear Algebra on FPGA

Author(s): De Matteis, Tiziano; Licht, Johannes de Fine; Hoefler, Torsten
Published in: arXiv, 5, 2019
Publisher: arXiv

A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning

Author(s): T. Ben-Nun, M. Besta, S. Huber, A. Nikolaos Ziogas, D. Peter, T. Hoefler
Published in: arXiv, 2019
Publisher: arXiv

Datasets

Software

StencilFlow

Author(s): Johannes de Fine Licht; Andreas Kuster; Tal Ben-Nun; Tiziano De Matteis; Dominic Hofer; Torsten Hoefler
DOI: 10.5281/zenodo.3878529
Publisher: Zenodo

spcl/gemm_hls v0.9

Author(s): Johannes de Fine Licht; Grzegorz Kwasniewski; Torsten Hoefler
DOI: 10.5281/zenodo.3559536
Publisher: Zenodo

StencilFlow

Author(s): Licht, Johannes De Fine; Kuster, Andreas; Ben-Nun, Tal; Matteis, Tiziano De; Hofer, Dominic; Hoefler, Torsten
DOI: 10.5281/zenodo.3878513
Publisher: Zenodo

DaCe - Data-Centric Parallel Programming Framework

Author(s): Ben-Nun, Tal; de Fine Licht, Johannes; Ziogas, Alexandros Nikolaos; Schneider, Timo; Hoefler, Torsten
DOI: 10.5281/zenodo.3376594; 10.5281/zenodo.3376595
Publisher: Zenodo

StencilFlow CGO 2021 Artifact Evaluation

Author(s): Johannes de Fine Licht; Andreas Kuster; Tiziano De Matteis; Tal Ben-Nun; Dominic Hofer; Torsten Hoefler
DOI: 10.5281/zenodo.4283388; 10.5281/zenodo.4283389
Publisher: Zenodo

Productivity, Portability, Performance: Data-Centric Python (Artifact)

Author(s): Ziogas, Alexandros Nikolaos; Schneider, Timo; Ben-Nun, Tal; Calotoiu, Alexandru; De Matteis, Tiziano; de Fine Licht, Johannes; Lavarini, Luca; Hoefler, Torsten
DOI: 10.5281/zenodo.5155509; 10.5281/zenodo.5155508
Publisher: Zenodo

Mitigating Network Noise on Dragonfly Networks through Application-Aware Routing (library code only)

Author(s): De Sensi, Daniele; Di Girolamo, Salvatore; Hoefler, Torsten
DOI: 10.5281/zenodo.3372785; 10.5281/zenodo.3372784
Publisher: Zenodo