CORDIS - EU research results

Serverless Data Analytics Platform

Deliverables

Final dissemination, exploitation, and adoption report

Thorough description of all dissemination activities and, specifically, our impact on major open source projects. It will also describe all exploitation activities in the different industrial partners and future plans for each product. The adoption report will contain all main contributions of the partners to relevant open source projects, both in terms of input documents submitted for consideration and software components integrated into the reference software implementations. It will also describe the user communities and the international adoption and impact of the results of the project.

Full implementation of the BLOSSOM middleware

This final deliverable includes: (1) a library of composable replicated data types and their support for Java as well as for a functional language running atop the JVM; (2) several mechanisms to allow object composition and sharding while preserving correctness; (3) a modular consensus layer able to adjust to the consistency of each object, paying the price of synchronization only where needed. The final evaluation is made using a large-scale serverless application from one of the partners.

Initial specs of the Serverless Compute and Execution Engine

This deliverable includes initial specifications and design of the Serverless Compute and Execution Engine. During the first six months of the project, an assessment of the leading open source serverless technologies, such as Apache OpenWhisk, PyWren, OpenFaaS, nuclio, and others, will be performed. The results of this assessment will be reported in the specification and design document, and an initial prototype will be built on the best-of-breed technology with the highest potential for impact.

CloudButton Initial API Definition

This deliverable includes a detailed definition of the exact abstractions and API we will be exposing. It will outline the methods by which users can mark up their code for parallelization, as well as pseudocode sketches of how all the custom data structures will behave. The deliverable will also provide initial observations on the complexity of the porting tools we have proposed, including experiments with static analysis and an audit of the required OpenMP and MPI functionality.

Serverless Compute Engine Design and Prototypes

This deliverable describes the Serverless Compute and Execution Engine software components. It also includes initial monitoring and instrumentation efforts to acquire telemetry and log data for the workloads across various services using real experiments. The collected data will be used to validate the software and to identify candidate performance improvements and I/O problems. Finally, it also includes a performance evaluation study on the data from the use cases.

Communication report

Description of the dissemination activities with lessons learned and progress reporting. It will also describe community involvement activities. Includes the initial version of the exploitation plan.

Communication plan

Definition of the required process and strategy for dissemination activities. Description of the planned dissemination activities and expected progress reporting. It will also describe planned community involvement activities.

CloudButton Architecture Specs and Early Prototypes

Specification of the Architecture and APIs. Documentation, early tutorials, and automated tests for the early prototypes of the different software components. First description and evaluation of results obtained from validation in use cases using different experiments and workloads.

Specification and partial support for degradable objects

This mid-term deliverable includes a full implementation of the client side of BLOSSOM and a prototype of the modular consensus-based library. The server side of BLOSSOM is able to implement any user-defined replicated data type. It runs in at least three degraded modes, including linearizability and update consistency. A library of replicated data types is available, some of which can be sharded if required by the programmer. At that stage, we plan a detailed evaluation of the prototype using a relevant middle-scale serverless application (e.g., data analytics).

CloudButton Toolkit Reference Implementation

This final deliverable encompasses a working implementation of the CloudButton toolkit and the associated porting tools. It will primarily focus on demonstrating the performance and robustness of several detailed reference implementations, along with in-depth comparisons to their equivalent implementations in existing frameworks. The deliverable will also include examples of porting existing HPC applications written in Java into those that run on CloudButton, using both our static analysis tools and our suite of libraries and standardised patterns. The comparison between CloudButton and the current state of the art here will be a key output of our work.

Initial prototype for stateful serverless computation

This deliverable includes a detailed specification of the programming language support for Java and an initial prototype of the server side. This initial prototype relies on the Creson framework and uses Infinispan for data distribution, replication and persistence.

CloudButton Prototype of Abstractions, Fault-tolerance and Porting Tools

This deliverable depends on the Serverless Compute Engine design (D3.1), as well as the programming abstractions built to deal with mutable state (D4.1). It will include an implementation of a significant portion of the mark-up annotations and custom data structures, evaluated via two prototypes of non-trivial big data and machine-learning problems. These prototypes will demonstrate automatically parallelized Java code based on stateful dataflow graphs, processed and executed on CloudButton. A simple fault-tolerance mechanism and associated configuration mechanism will also be provided.

Reference implementation of architectural building blocks

Public release of stable software components of the CloudButton Toolkit. Complete specification and APIs. Final description and evaluation of results obtained from validation in use cases using different workloads.

Experiments and Initial Specifications

Description of use case scenarios, experiments and benchmarking framework. Initial specifications of the architecture.

Serverless Compute Engine Reference Implementation

This deliverable includes the final design and reference implementation for all tasks. Description and specification of the stable release of the system, including all APIs and the software components that implement these APIs: the admin toolset, runtime APIs, Execution Engine, and resource scheduler. This deliverable also presents the integrated system architecture and consolidated results on the performance achieved by the joint operation of monitoring and deployment tools. It also includes tutorials to facilitate the adoption of the platform by third-party developers. Finally, it also includes the performance evaluation study on the data and experiments from the use cases.

Public Project Website

A website will be developed in order to provide a continuous update about the project progress and the results obtained during it. All public deliverables and publications (Open Access) will be uploaded on the website.

Data Management Plan, 3rd Version

This deliverable presents the third version of the project Data Management Plan (DMP). It is submitted on Month 39 as a Final review of the CloudButton Data Management Plan.

Data Management Plan, 2nd Version

This deliverable presents the second version of the project Data Management Plan (DMP). It is submitted on Month 18 as a Mid-Term review of the CloudButton Data Management Plan.

Data Management Plan, 1st version

First version of the data management plan. The different experiments, workloads, benchmarks, and results will be delivered as Open Research Data for the community. This deliverable will evolve during the lifetime of the project in order to present the status of the project’s reflections on data management.

Publications

MLLess: Achieving Cost Efficiency in Serverless Machine Learning Training

Author(s): Pablo Gimeno Sarroca, Marc Sánchez-Artigas
Published in: 2022
Publisher: arXiv
DOI: 10.48550/arxiv.2206.05786

Decentralize the feedback infrastructure!

Author(s): Pedro Garcia Lopez
Published in: 2020
Publisher: arXiv
DOI: 10.48550/arxiv.2010.03356

ServerMix: Tradeoffs and Challenges of Serverless Data Analytics

Author(s): García-López, Pedro; Sánchez-Artigas, Marc; Shillaker, Simon; Pietzuch, Peter; Breitgand, David; Vernik, Gil; Sutra, Pierre; Tarrant, Tristan; Ferrer, Ana Juan
Published in: 2019
Publisher: Cornell University

Serverless Predictions: 2021-2030

Author(s): Pedro Garcia Lopez, Aleksander Slominski, Michael Behrendt, Bernard Metzler
Published in: 2021
Publisher: arXiv

Using Biological Signals for Mass Recalibration of Mass Spectrometry Imaging Data

Author(s): Raphaël La Rocca, Christopher Kune, Mathieu Tiquet, Lachlan Stuart, Theodore Alexandrov, Edwin De Pauw, Loïc Quinton
Published in: 2020
Publisher: ChemRxiv
DOI: 10.26434/chemrxiv.12901679.v1

Transparent Serverless execution of Python multiprocessing applications

Author(s): Aitor Arjona, Gerard Finol, Pedro Garcia-Lopez
Published in: 2022
Publisher: arXiv

Serverless End Game: Disaggregation enabling Transparency

Author(s): García-López, Pedro; Slominski, Aleksander; Shillaker, Simon; Behrendt, Michael; Metzler, Bernard
Published in: 2020
Publisher: arXiv
DOI: 10.48550/arxiv.2006.01251

Please, do not decentralize the Internet with (permissionless) blockchains!

Author(s): Pedro Garcia Lopez, Alberto Montresor, Anwitaman Datta
Published in: 2019
Publisher: arXiv
DOI: 10.48550/arxiv.1904.13093

EGEON: Software-Defined Data Protection for Object Storage

Author(s): Raul Saiz-Laudo, Marc Sanchez-Artigas
Published in: 2022
Publisher: arXiv

Efficient replication via timestamp stability

Author(s): Vitor Enes, Carlos Baquero, Alexey Gotsman, Pierre Sutra
Published in: EuroSys '21: Proceedings of the Sixteenth European Conference on Computer Systems, 2021, Page(s) 178–193, ISBN 978-1-4503-8334-9
Publisher: Association for Computing Machinery
DOI: 10.1145/3447786.3456236

Triggerflow - trigger-based orchestration of serverless workflows

Author(s): Pedro García López, Aitor Arjona, Josep Sampé, Aleksander Slominski, Lionel Villard
Published in: Proceedings of the 14th ACM International Conference on Distributed and Event-based Systems, 2020, Page(s) 3-14, ISBN 9781450380287
Publisher: ACM
DOI: 10.1145/3401025.3401731

FaaS Orchestration of Parallel Workloads

Author(s): Daniel Barcelona-Pons, Pedro García-López, Álvaro Ruiz, Amanda Gómez-Gómez, Gerard París, Marc Sánchez-Artigas
Published in: WOSC '19: Proceedings of the 5th International Workshop on Serverless Computing, 2019, Page(s) 25-30, ISBN 978-1-4503-7038-7
Publisher: Association for Computing Machinery
DOI: 10.1145/3366623.3368137

Bringing scaling transparency to Proteomics applications with serverless computing

Author(s): Mariano Ezequiel Mirabelli, Pedro García-López, Gil Vernik
Published in: WoSC'20: Proceedings of the 2020 Sixth International Workshop on Serverless Computing, 2021, Page(s) 55–60, ISBN 978-1-4503-8204-5
Publisher: Association for Computing Machinery
DOI: 10.1145/3429880.3430101

A milestone for FaaS pipelines; object storage-vs VM-driven data exchange

Author(s): Germán T. Eizaguirre, Marc Sánchez-Artigas, Pedro García-López
Published in: Middleware '21: Proceedings of the 22nd International Middleware Conference: Demos and Posters, 2021, Page(s) 10-11, ISBN 978-1-4503-9154-2
Publisher: Association for Computing Machinery
DOI: 10.1145/3491086.3492472

State-Machine Replication for Planet-Scale Systems

Author(s): Vitor Enes, Carlos Baquero, Tuanir França Rezende, Alexey Gotsman, Matthieu Perrin, Pierre Sutra
Published in: EuroSys '20: Fifteenth European Conference on Computer Systems, Article No. 24, 2020, Page(s) 1-15, ISBN 978-1-4503-6882-7
Publisher: Association for Computing Machinery
DOI: 10.1145/3342195.3387543

Serverless Elastic Exploration of Unbalanced Algorithms

Author(s): Gerard París; Pedro García-López; Marc Sánchez-Artigas
Published in: 2020 IEEE 13th International Conference on Cloud Computing (CLOUD), 2020, ISBN 978-1-7281-8780-8
Publisher: Institute of Electrical and Electronics Engineers
DOI: 10.1109/cloud49709.2020.00033

Primula: a Practical Shuffle/Sort Operator for Serverless Computing

Author(s): Marc Sánchez-Artigas, Germán T. Eizaguirre, Gil Vernik, Lachlan Stuart, Pedro García-López
Published in: Middleware '20: Proceedings of the 21st International Middleware Conference Industrial Track, 2020, ISBN 978-1-4503-8201-4
Publisher: Association for Computing Machinery
DOI: 10.1145/3429357.3430522

J-NVM: Off-heap Persistent Objects in Java

Author(s): Anatole Lefort, Yohan Pipereau, Kwabena Amponsem, Pierre Sutra, Gaël Thomas
Published in: SOSP '21: Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles, 2021, Page(s) 408–423, ISBN 978-1-4503-8709-5
Publisher: Association for Computing Machinery
DOI: 10.1145/3477132.3483579

Faasm: Lightweight Isolation for Efficient Stateful Serverless Computing

Author(s): Simon Shillaker, Peter Pietzuch
Published in: USENIX Annual Technical Conference 2020, 2020, Page(s) 419-433, ISBN 978-1-939133-14-4
Publisher: USENIX Association

On the FaaS Track - Building Stateful Distributed Applications with Serverless Architectures

Author(s): Daniel Barcelona-Pons, Marc Sánchez-Artigas, Gerard París, Pierre Sutra, Pedro García-López
Published in: Proceedings of the 20th International Middleware Conference, 2019, Page(s) 41-54, ISBN 9781450370097
Publisher: ACM
DOI: 10.1145/3361525.3361535

Leaderless State-Machine Replication: Specification, Properties, Limits (Extended Version)

Author(s): Tuanir França Rezende, Pierre Sutra
Published in: DISC '20: 34th International Symposium on Distributed Computing, 2020, Page(s) 24:1-24:17, ISBN 978-3-95977-168-9
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik
DOI: 10.4230/lipics.disc.2020.24

The serverless shell

Author(s): Aurèle Mahéo, Pierre Sutra, Tristan Tarrant
Published in: Middleware '21: Proceedings of the 22nd International Middleware Conference: Industrial Track, 2021, Page(s) 9-15, ISBN 978-1-4503-9152-8
Publisher: Association for Computing Machinery
DOI: 10.1145/3491084.3491426

Spatial Metabolomics and Imaging Mass Spectrometry in the Age of Artificial Intelligence

Author(s): Theodore Alexandrov
Published in: Annual Review of Biomedical Data Science, 2020, ISSN 2574-3414
Publisher: Annual Reviews
DOI: 10.1146/annurev-biodatasci-011420-031537

On the correctness of Egalitarian Paxos

Author(s): Pierre Sutra
Published in: Information Processing Letters, Issue 156, 2020, Page(s) 105901, ISSN 0020-0190
Publisher: Elsevier BV
DOI: 10.1016/j.ipl.2019.105901

Triggerflow: Trigger-based orchestration of serverless workflows

Author(s): Aitor Arjona, Pedro García-López, Josep Sampé, Aleksander Slominski, Lionel Villard
Published in: Future Generation Computer Systems, Volume 124, 2021, Page(s) 215-229, ISSN 0167-739X
Publisher: Elsevier BV
DOI: 10.1016/j.future.2021.06.004

Benchmarking parallelism in FaaS platforms

Author(s): Daniel Barcelona-Pons, Pedro García-López
Published in: Future Generation Computer Systems, Volume 124, 2021, Page(s) 268-284, ISSN 0167-739X
Publisher: Elsevier BV
DOI: 10.1016/j.future.2021.06.005

OffsampleAI: artificial intelligence approach to recognize off-sample mass spectrometry images

Author(s): Katja Ovchinnikova, Vitaly Kovalev, Lachlan Stuart & Theodore Alexandrov
Published in: BMC Bioinformatics, Issue 21, 2020, Page(s) 129, ISSN 1471-2105
Publisher: BioMed Central
DOI: 10.1186/s12859-020-3425-x

Outsourcing Data Processing Jobs with Lithops

Author(s): Josep Sampe, Marc Sanchez-Artigas, Gil Vernik, Ido Yehekzel, Pedro Garcia-Lopez
Published in: IEEE Transactions on Cloud Computing, 2021, ISSN 2168-7161
Publisher: Institute of Electrical and Electronics Engineers Inc.
DOI: 10.1109/tcc.2021.3129000

Stateful Serverless Computing with Crucial

Author(s): Daniel Barcelona-Pons, Pierre Sutra, Marc Sánchez-Artigas, Gerard París, Pedro García-López
Published in: ACM Transactions on Software Engineering and Methodology, Volume 31, Issue 3, Article 39, 2022, Page(s) 1-38, ISSN 1049-331X
Publisher: Association for Computing Machinery, Inc.
DOI: 10.1145/3490386

A compressed file partitioner for scalable Genomics analysis with Serverless technology

Author(s): Francisco Damián Maleno González
Published in: 2021
Publisher: University Rovira i Virgili

Study of the Feasibility of Serverless Access Transparency for Python Multiprocessing Applications

Author(s): Gerard Finol Peñalver, Aitor Arjona Pérez
Published in: 2021
Publisher: University Rovira i Virgili

Serverless OCaml Genomic Pipeline Parallelisation Engine

Author(s): Gil Arasa Verge
Published in: 2022
Publisher: University Rovira i Virgili

Machine Learning on a Serverless Architecture

Author(s): Pablo Gimeno Sarroca
Published in: 2021
Publisher: University Rovira i Virgili

Painless Data Analytics in the Cloud. Grouping data in serverless architectures

Author(s): German Telmo Eizaguirre Suarez
Published in: 2021
Publisher: University Rovira i Virgili

Porting Genomics pipelines to the Cloud - Serverless Computing as an avenue for scalable variant calling

Author(s): Xavier Roca i Canals
Published in: 2022
Publisher: University Rovira i Virgili

Trade-Offs and Challenges of Serverless Data Analytics

Author(s): Pedro García-López, Marc Sánchez-Artigas, Simon Shillaker, Peter Pietzuch, David Breitgand, Gil Vernik, Pierre Sutra, Tristan Tarrant, Ana Juan-Ferrer & Gerard París
Published in: Technologies and Applications for Big Data Value, 2021, Page(s) 41-61, ISBN 978-3-030-78307-5
Publisher: Springer, Cham
DOI: 10.1007/978-3-030-78307-5_3
