CORDIS - EU research results

European joint Effort toward a Highly Productive Programming Environment for Heterogeneous Exascale Computing (EPEEC)

Results

Revised application specification and porting report

Updated version of the content of D5.2 with feedback from application developers after a preliminary assessment of the porting of the simplified test programs. This report will also feature the latest performance figures on the simplified test programs after their adaptation to the GASPI+OmpSs and OmpSs@ArgoDSM models. This is an outcome of Task 5.2.

User's guide for application developers and system managers

This document is the main outcome of Task 4.7.

Training Plan

This deliverable will specify all training needs, activities, and materials needed during the project. The plan will also exploit existing training channels, such as the PRACE training centres, among others, by participating actively in the suggested trainings and adapting the courses to their requirements, if any. The initial idea is to organise two courses each year on the hot topics defined by the various work package leaders, as well as to include sessions in summer schools such as PUMPS, ISC/SC tutorials or BoF sessions, or to organise a dedicated PATC course at BSC and participate in the yearly hackathon.

Final dissemination report

The dissemination report will cover the dissemination activities of the third project year and their analysis.

Initial report of OmpSs+OpenACC/OpenMP interoperability

Initial report of the proposed syntax for the combination of OmpSs and accelerator kernels defined with OpenACC and OpenMP syntaxes. Includes semantics and syntax compatibility of OmpSs and OpenACC/OpenMP accelerator kernel definitions, as well as proposals of interoperability improvement for the OpenACC and OpenMP specifications. This is an outcome of Task 3.2.

Report on extrapolation to exascale for full application codes

Technical report detailing the achievements related to the exascale enabling of full application codes. This is an outcome of Task 5.4.

Dissemination and exploitation plan

The Dissemination and Exploitation Plan will provide tools for the marketing activities that will include the website, customer leaflets, social media, press strategy, scientific publications policy, online material strategy, list of key events to be attended, and calendar of activities. This deliverable will include the initial analysis of the exploitation context, business opportunities, and exploitable results.

Report on prototype of distributed memory models and interoperability

Includes GASPI+OmpSs and OmpSs@cluster over ArgoDSM. The report will also review coding productivity and scalability, as well as lessons learnt and actions performed in Task 4.5.

First dissemination report

The dissemination report will cover the dissemination activities of the first year and a half of the project and their analysis.

Final application specification and porting report

Updated version of D5.3 after adaptation of the automatic code generation tool based on feedback from application developers. This report will also describe the development actions undertaken on the full application codes while porting them to the GASPI+OmpSs and OmpSs@ArgoDSM models. This report will include an assessment of coding productivity, performance (compared to the latest performance figures on the simplified test programs reported in D5.3), and energy. This is an outcome of Task 5.3.

Initial application specification and porting report

This document will report on the main programming features of application codes and the definition of simplified test programs that are representative of the main compute-intensive, data-intensive, and extreme-data parts of the full application codes. It will also report on the first development actions on these simplified test programs. This is an outcome of Task 5.2.

Specification report on requirements of application codes

This document will describe the required developments on the application codes in view of adopting the GASPI+OmpSs and OmpSs@ArgoDSM models, together with a porting scenario. This is an outcome of Task 5.1.

Final report on approaches considered and solutions adopted

Includes automatic tasking annotation, proposals of composability improvement for the OpenACC and OpenMP specifications, tasking within accelerators in OmpSs, heterogeneous memory management, and energy considerations. This is an outcome of Tasks 3.1 to 3.4 and 3.6.

Initial software prototypes WP4

Early prototype releases of enhanced GPI, ArgoDSM, OmpSs with ArgoDSM support, and BSC tools. This is an outcome of Tasks 4.1 to 4.3.

Testing and development platform setup

This is an outcome of Task 4.4 and will be leveraged during the rest of the project and beyond.

Intermediate software prototypes WP4

Updates from Tasks 4.1 to 4.3.

Final software releases WP4

Final releases of all software components involved in this WP.

Data management plan

This deliverable describes the life cycle for all data sets being updated regularly during the development of the project.

Publications

Breaking master-slave model between host and FPGAs

Authors: Jaume Bosch; Miquel Vidal; Antonio Filgueras; Carlos Alvarez; Daniel Jiménez-González; Xavier Martorell; Eduard Ayguadé
Published in: Association for Computing Machinery (ACM), Issue 3, 2020, Page(s) 419–420, ISBN 978-1-4503-6818-6
Publisher: Association for Computing Machinery (ACM)
DOI: 10.1145/3332466.3374545

A High-Performance Implementation of Bayesian Matrix Factorization with Limited Communication

Authors: Tom Vander Aa, Xiangju Qin, Paul Blomstedt, Roel Wuyts, Wilfried Verachtert, Samuel Kaski
Published in: International Conference on Computational Science (ICCS 2020), 2020
Publisher: ICCS
DOI: 10.1007/978-3-030-50433-5_1

Efficient and Eventually Consistent Collective Operations

Authors: Roman Iakymchuk; Amandio Faustino; Andrew Emerson; João Barreto; Valeria Bartsch; Rodrigo Rodrigues; José Monteiro
Published in: Fraunhofer ITWM, Issue 2, 2021
Publisher: IEEE
DOI: 10.5281/zenodo.4588540

TSOPER: Efficient Coherence-Based Strict Persistency

Authors: Per Ekemark; Yuan Yao; Alberto Ros; Konstantinos Sagonas; Stefanos Kaxiras
Published in: 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2021
Publisher: IEEE
DOI: 10.1109/hpca51647.2021.00021

Particle-in-Cell Simulation using Asynchronous Tasking

Authors: Nicolas Guidotti, Pedro Ceyrat, João Barreto, José Monteiro, Rodrigo Rodrigues, Ricardo Fonseca, Xavier Martorell, Antonio J. Peña
Published in: 27th European Conference on Parallel and Distributed Computing (Euro-Par 2021), 2021, vol. 12820, Page(s) 482-498, ISBN 978-3-030-85665-6
Publisher: Springer
DOI: 10.48550/arxiv.2106.12485

SPHT: Scalable Persistent Hardware Transactions

Authors: Daniel Castro, Alexandro Baldassin, João Barreto, Paolo Romano
Published in: 19th USENIX Conference on File and Storage Technologies (FAST'21), 2021, ISBN 978-1-939133-20-5
Publisher: USENIX

Towards OmpSs-2 and OpenACC interoperation

Authors: Orestis Korakitis, Simon Garcia De Gonzalo, Nicolas Guidotti, João Pedro Barreto, José C. Monteiro, Antonio J. Peña
Published in: PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2022, Page(s) 433–434
Publisher: Association for Computing Machinery
DOI: 10.1145/3503221.3508401

Hardware Locality-Aware Partitioning and Dynamic Load-Balancing of Unstructured Meshes for Large-Scale Scientific Applications

Authors: Pavanakumar Mohanamuraly; Gabriel Staffelbach
Published in: Crossref - PASC '20: Proceedings of the Platform for Advanced Scientific Computing Conference, Issue 1, 2020
Publisher: ACM
DOI: 10.1145/3394277.3401851

SMURFF: a High-Performance Framework for Matrix Factorization

Authors: Tom Vander Aa, Imen Chakroun, Thomas J. Ashby
Published in: 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2019, Page(s) 304-308, ISBN 978-1-5386-7884-8
Publisher: IEEE
DOI: 10.1109/aicas.2019.8771607

REVIEWING DATA ACCESS PATTERNS AND COMPUTATIONAL REDUNDANCY FOR MACHINE LEARNING ALGORITHMS

Authors: Imen Chakroun, Tom Vander Aa, Tom Ashby
Published in: 4th International Conference on Big Data Analytics, Data Mining and Computational Intelligence, 2019, Page(s) 31-38, ISBN 978-989-8533-92-0
Publisher: IADIS

Bandwidth-Aware Page Placement in NUMA

Authors: David Gureya, João Neto, Reza Karimi, João Barreto, Pramod Bhatotia, Vivien Quema, Rodrigo Rodrigues, Paolo Romano, Vladimir Vlassov
Published in: Proceedings of the 34th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Issue 2020, 2020
Publisher: IEEE

Tasking in Accelerators: Performance Evaluation

Authors: Leonel Toledo, Antonio J. Peña, Sandra Catalan, Pedro Valero-Lara
Published in: 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), 2019, Page(s) 127-132, ISBN 978-1-7281-2616-6
Publisher: IEEE
DOI: 10.1109/PDCAT46702.2019.00034

Static Graphs for Coding Productivity in OpenACC

Authors: Leonel Toledo; Pedro Valero-Lara; Jeffrey Vetter; Antonio J. Peña
Published in: Institute of Electrical and Electronics Engineers (IEEE), 2022, ISSN 2640-0316
Publisher: IEEE
DOI: 10.1109/hipc53243.2021.00050

JACC: An OpenACC Runtime Framework with Kernel-Level and Multi-GPU Parallelization

Authors: Matsumura, Kazuaki; García de Gonzalo, Simón; Peña Monferrer, Antonio José
Published in: Institute of Electrical and Electronics Engineers (IEEE), 2021, Page(s) 182-191, ISBN 978-1-6654-1016-8
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
DOI: 10.1109/hipc53243.2021.00032

Towards the Large-Eddy Simulation of a full engine: Integration of a 360 azimuthal degrees fan, compressor and combustion chamber. Part I: Methodology and initialisation

Authors: Pérez Arroyo C., Dombard J., Duchaine F., Gicquel L., Martin B., Odier N., Staffelbach G.
Published in: Journal of the Global Power and Propulsion Society, Special Issue: Data-Driven Modelling and High-Fidelity Simulations, 2021, Page(s) 1-16, ISSN 2515-3080
Publisher: Bentus Publishing
DOI: 10.33737/jgpps/133115

Towards Enhancing Coding Productivity for GPU Programming Using Static Graphs

Authors: Leonel Toledo, Pedro Valero-Lara, Jeffrey S. Vetter, Antonio J. Peña
Published in: MDPI - Electronics, 2022, ISSN 2079-9292
Publisher: MDPI - Electronics
DOI: 10.3390/electronics11091307

Asynchronous Runtime with Distributed Manager for Task-based Programming Models

Authors: Jaume Bosch, Carlos Álvarez, Daniel Jiménez-González, Xavier Martorell, Eduard Ayguadé
Published in: Parallel Computing Journal, 2020, ISSN 0167-8191
Publisher: Elsevier BV
DOI: 10.1016/j.parco.2020.102664

Guidelines for enhancing data locality in selected machine learning algorithms

Authors: Imen Chakroun, Tom Vander Aa, Tomas J. Ashby
Published in: Intelligent Data Analysis, Issue 23/5, 2019, Page(s) 1003-1020, ISSN 1088-467X
Publisher: Elsevier Science
DOI: 10.3233/ida-184287

Persistent Memory: A Survey of Programming Support and Implementations

Authors: Alexandro Baldassin, João Barreto, Daniel Castro, Paolo Romano
Published in: ACM Computing Surveys, 2021, Article No. 152, Page(s) 1–37, ISSN 0360-0300
Publisher: Association for Computing Machinery, Inc.
DOI: 10.1145/3465402

Parallelware Tools: An Experimental Evaluation on POWER Systems

Authors: Manuel Arenaz; Xavier Martorell
Published in: Springer, Issue 5, 2019, ISBN 978-3-030-34355-2
Publisher: Springer
DOI: 10.1007/978-3-030-34356-9_27

Task-Based Programming Models for Heterogeneous Recurrent Workloads

Authors: Jaume Bosch; Miquel Vidal; Antonio Filgueras; Daniel Jiménez-González; Carlos Alvarez; Xavier Martorell; Eduard Ayguadé
Published in: International Symposium on Applied Reconfigurable Computing, Issue 2, 2021, ISBN 978-3-030-79024-0
Publisher: Springer Nature
DOI: 10.1007/978-3-030-79025-7_8

Intellectual Property Rights

SYSTEMS AND METHODS FOR COHERENCE IN CLUSTERED CACHE HIERARCHIES

Application/Publication number: 20 1615015274
Date: 2016-02-04
Applicant(s): ETA SCALE AB

SYSTEMS AND METHODS FOR INVISIBLE SPECULATIVE EXECUTION

Application/Publication number: 20 2016825399
Date: 2020-03-20
Applicant(s): ETA SCALE AB

SYSTEM AND METHOD FOR EVENT MONITORING IN CACHE COHERENCE PROTOCOLS WITHOUT EXPLICIT INVALIDATIONS

Application/Publication number: 15 705094
Date: 2015-01-02
Applicant(s): ETA SCALE AB

SYSTEM AND METHOD FOR SELF-INVALIDATION, SELF-DOWNGRADE CACHE COHERENCE PROTOCOLS

Application/Publication number: 20 1715855378
Date: 2017-12-27
Applicant(s): ETA SCALE AB

SYSTEM AND METHOD FOR EVENT MONITORING IN CACHE COHERENCE PROTOCOLS WITHOUT EXPLICIT INVALIDATIONS

Application/Publication number: 20 2016825399
Date: 2015-01-02
Applicant(s): ETA SCALE AB

MULTI-CORE COMPUTER SYSTEMS WITH PRIVATE/SHARED CACHE LINE INDICATORS

Application/Publication number: 20 1916291154
Date: 2019-03-04
Applicant(s): ETA SCALE AB

SYSTEMS AND METHODS FOR NON-SPECULATIVE STORE COALESCING AND GENERATING ATOMIC WRITE SETS USING ADDRESS SUBSETS

Application/Publication number: 20 1916388120
Date: 2019-04-18
Applicant(s): ETA SCALE AB

SYSTEM AND METHOD FOR SELF-INVALIDATION, SELF-DOWNGRADE CACHE COHERENCE PROTOCOLS

Application/Publication number: 20 2016825399
Date: 2019-12-11
Applicant(s): ETA SCALE AB
