Software Defined Storage for Big Data

Deliverables

IOStack Architecture Specs and Benchmarking framework

First specification of the IOStack Architecture and APIs. Description of use case scenarios and benchmarking framework.

Second Period Management Report

Second Period Management report including completion of deliverables and tasks, as well as resources and efforts spent by partners and description of activities in this period.

Communication plan for dissemination

Definition of the required process and strategy for dissemination activities.

Public release of the IOStack Toolkit

Public release of stable IOStack prototypes. Complete specification and APIs of the IOStack toolkit. First description and evaluation of results obtained from validation in use cases using different workloads.

Stable release and specifications of the SDS framework for analytics

Description and specification of the stable release of the system, including all APIs and the software components that implement them: the admin toolset, collector, provisioner, and storage offload.

System object model design

This deliverable describes the object model of the compute and storage system. The system object model is the dataset that the SDS framework helper functions operate on, and all provisioning requests from the SDS framework use it. It is implemented as a database that requires setup in two steps. In step 1, the Admin toolset is used to create a high-level abstraction of system containers with associated policies. In step 2, physical assets are mapped to the logical groups and the properties of those assets are defined. The provisioning tools then use this database to make provisioning requests. This deliverable also contains the complete set of APIs for accessing the SDS system object model from components developed in other work packages.
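
The two-step setup described above can be illustrated with a minimal sketch. It is not the project's API: the class names, method names, and fields (`ObjectModel`, `create_container`, `map_asset`, `replicas`, `media`) are hypothetical, and an in-memory dictionary stands in for the database.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A physical asset and its properties (hypothetical fields)."""
    name: str
    properties: dict


@dataclass
class Container:
    """A logical system container with its associated policies."""
    name: str
    policies: dict
    assets: list = field(default_factory=list)


class ObjectModel:
    """In-memory stand-in for the system object model database."""

    def __init__(self):
        self.containers = {}

    def create_container(self, name, policies):
        # Step 1: define a logical container and its policies.
        if name in self.containers:
            raise ValueError(f"container {name!r} already exists")
        self.containers[name] = Container(name, dict(policies))
        return self.containers[name]

    def map_asset(self, container_name, asset_name, properties):
        # Step 2: map a physical asset to an existing logical group.
        # Raises KeyError if step 1 was skipped for this container.
        container = self.containers[container_name]
        asset = Asset(asset_name, dict(properties))
        container.assets.append(asset)
        return asset


model = ObjectModel()
model.create_container("analytics-tier", {"replicas": 3})
model.map_asset("analytics-tier", "node-01", {"media": "ssd", "capacity_gb": 960})
```

Keeping the logical containers separate from the assets mapped into them mirrors the split between the two steps: policies can be defined before any hardware is assigned.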

Data Management Plan

First version of the data management plan describing project management strategies. The different experiments, workloads, benchmarks, and results will be delivered as Open Research Data for the community. This deliverable will evolve during the lifetime of the project in order to present the status of the project’s reflections on data management.

SDS Toolkit initial prototype

Initial implementation of several components that implement the API specification. The system object model contains run-time data used by the provisioning tools to decide how to optimally make a provisioning request. Specifically, the collector is responsible for collecting the run-time data used in the provisioning process, and for discovering the participating nodes and resources and building their topology. This deliverable also includes the component that receives the SDS REST API provisioning request, scans the system object model, and uses policies, heuristics, topology, and run-time data to decide which resource should satisfy the request. The system design allows for different heuristics (e.g., optimal speed or minimal cost). This deliverable also implements the API functions to run a compute operation on the storage node, allowing the analytics process to be implemented with both compute-based and storage-based compute operations.
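
A minimal sketch of how a provisioning decision with interchangeable heuristics might look is given below. It is an illustration under stated assumptions, not the project's implementation: the `Resource` fields, the `fastest` and `cheapest` heuristics, and the `provision` function are all hypothetical names, and policies and topology are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """A candidate resource with run-time data (hypothetical fields)."""
    name: str
    free_gb: float
    throughput_mbps: float
    cost_per_gb: float


def fastest(resource):
    # Lower score wins, so negate throughput to prefer the fastest resource.
    return -resource.throughput_mbps


def cheapest(resource):
    # Prefer the resource with the lowest cost per gigabyte.
    return resource.cost_per_gb


def provision(resources, size_gb, heuristic):
    """Pick the resource that best satisfies a request under a heuristic."""
    # Keep only resources with enough free capacity for the request.
    candidates = [r for r in resources if r.free_gb >= size_gb]
    if not candidates:
        raise LookupError("no resource can satisfy the request")
    return min(candidates, key=heuristic)


resources = [
    Resource("ssd-pool", free_gb=500, throughput_mbps=2000, cost_per_gb=0.20),
    Resource("hdd-pool", free_gb=8000, throughput_mbps=200, cost_per_gb=0.03),
]
by_speed = provision(resources, 100, fastest)
by_cost = provision(resources, 100, cheapest)
```

Passing the heuristic as a function keeps the selection logic unchanged when a new optimization goal is added.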

First Period Management Report

First Period Management report including completion of deliverables and tasks, as well as resources and efforts spent by partners and description of activities in this period.

Prototype and initial evaluation

This deliverable presents an implementation of the storlets for analytics framework and evaluates its initial performance. It also presents research results and/or implementations for the data reduction and performance optimization tasks.

System Deployment Strategies

This deliverable presents the system model, optimization objectives and the corresponding algorithms to find suitable deployment strategies that satisfy the constraints imposed by the model. Such deployment strategies and algorithms determine, to a large extent, the system deployment tools developed in the project.

Summary and demonstration of results

This deliverable summarizes the work done in the work package and demonstrates the final implementations and research results.

Community involvement, workshop, and dissemination report

Description of the Year 2 dissemination activities with lessons learned and progress reporting.

Design and implementation progress report

Outlines the design and initial implementation of the Storlets, as well as the directions and goals for the data reduction and optimization tasks.

Consolidated System Monitoring and Deployment Tools

This deliverable presents the integrated system architecture and consolidated results on the performance achieved by the joint operation of monitoring and deployment tools.

System Monitoring Design and Preliminary Evaluation

This deliverable describes the architecture of the monitoring tools required to define system models, which make it possible to reason about deployment strategies, and presents preliminary results on system performance when DISC frameworks are deployed according to naive strategies.

Reference implementation of architectural building blocks

Final reference implementation of the IOStack toolkit. Complete documentation including tutorials and how-to documents. Final evaluation of results obtained from validation in the different use cases.

Final dissemination, exploitation, and standardization report

Final Report explaining dissemination and collaboration activities.

Publications

Improving OpenStack Swift interaction with the I/O Stack to enable Software Defined Storage

Author(s): Ramon Nou, Alberto Miranda, Marc Siquier, Toni Cortes
Published in: Proceedings of the 7th IEEE International Symposium on Cloud and Service Computing, 2017

Stocator - a high performance object store connector for spark

Author(s): Gil Vernik, Michael Factor, Elliot K. Kolodner, Effi Ofer, Pietro Michiardi, Francesco Pace
Published in: Proceedings of the 10th ACM International Systems and Storage Conference on - SYSTOR '17, 2017, Page(s) 1-1
DOI: 10.1145/3078468.3078496

Flexible Scheduling of Distributed Analytic Applications

Author(s): Francesco Pace, Daniele Venzano, Damiano Carra, Pietro Michiardi
Published in: 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), 2017, Page(s) 100-109
DOI: 10.1109/ccgrid.2017.52

Stocator - an object store aware connector for apache spark

Author(s): Gil Vernik, Michael Factor, Elliot K. Kolodner, Effi Ofer, Pietro Michiardi, Francesco Pace
Published in: Proceedings of the 2017 Symposium on Cloud Computing - SoCC '17, 2017, Page(s) 653-653
DOI: 10.1145/3127479.3134761

Crystal: Software-Defined Storage for Multi-Tenant Object Stores

Author(s): Raúl Gracia-Tinedo, Josep Sampé, Edgar Zamora, Marc Sánchez-Artigas, Pedro García-López, Yosef Moatti, Eran Rom
Published in: 15th USENIX Conference on File and Storage Technologies, FAST 2017, 2017, Page(s) 243-256

Too Big to Eat: Boosting Analytics Data Ingestion from Object Stores with Scoop

Author(s): Yosef Moatti, Eran Rom, Raul Gracia-Tinedo, Dalit Naor, Doron Chen, Josep Sampe, Marc Sanchez-Artigas, Pedro Garcia-Lopez, Filip Gluszak, Eric Deschdt, Francesco Pace, Daniele Venzano, Pietro Michiardi
Published in: 2017 IEEE 33rd International Conference on Data Engineering (ICDE), 2017, Page(s) 309-320
DOI: 10.1109/icde.2017.243

Data-driven serverless functions for object storage

Author(s): Josep Sampé, Marc Sánchez-Artigas, Pedro García-López, Gerard París
Published in: Proceedings of the 18th ACM/IFIP/USENIX Middleware Conference on - Middleware '17, 2017, Page(s) 121-133
DOI: 10.1145/3135974.3135980

CRESON: Callable and Replicated Shared Objects over NoSQL

Author(s): Pierre Sutra, Etienne Riviere, Cristian Cotes, Marc Sanchez Artigas, Pedro Garcia Lopez, Emmanuel Bernard, William Burns, Galder Zamarreno
Published in: 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), 2017, Page(s) 115-128
DOI: 10.1109/icdcs.2017.239

Oblivious RAM as a Substrate for Cloud Storage -- The Leakage Challenge Ahead

Author(s): Marc Sánchez-Artigas
Published in: Proceedings of the 2016 ACM on Cloud Computing Security Workshop - CCSW '16, 2016, Page(s) 49-53
DOI: 10.1145/2996429.2996430

Random Feature Expansions for Deep Gaussian Processes

Author(s): Kurt Cutajar, Edwin V. Bonilla, Pietro Michiardi, Maurizio Filippone
Published in: Proceedings of the 34th International Conference on Machine Learning, Issue 70, 2017, Page(s) 884-893, ISSN 1938-7228

TallyNetworks: Protecting Your Private Opinions with Edge-centric Computing

Author(s): Marc Ruiz Rodríguez, Pedro García-López, Marc Sánchez-Artigas
Published in: LSDVE 2016, 4th Workshop on Large Scale Distributed Virtual Environments, Issue To appear, 2016

Experimental Performance Evaluation of Cloud-Based Analytics-as-a-Service

Author(s): Francesco Pace, Marco Milanesio, Daniele Venzano, Damiano Carra, Pietro Michiardi
Published in: 2016 IEEE 9th International Conference on Cloud Computing (CLOUD), 2016, Page(s) 196-203
DOI: 10.1109/cloud.2016.0035

Understanding Data Sharing in Private Personal Clouds

Author(s): Raul Gracia-Tinedo, Pedro Garcia-Lopez, Alberto Gomez, Anastasio Illana
Published in: 2016 IEEE 9th International Conference on Cloud Computing (CLOUD), 2016, Page(s) 392-399
DOI: 10.1109/cloud.2016.0059

Vertigo: Programmable Micro-controllers for Software-Defined Object Storage

Author(s): Josep Sampe, Pedro Garcia-Lopez, Marc Sanchez-Artigas
Published in: 2016 IEEE 9th International Conference on Cloud Computing (CLOUD), 2016, Page(s) 180-187
DOI: 10.1109/cloud.2016.0033

SDGen: Mimicking Datasets for Content Generation in Storage Benchmarks

Author(s): Raúl Gracia-Tinedo, Danny Harnik, Dalit Naor, Dmitry Sotnikov, Sivan Toledo, Aviad Zuck
Published in: 13th USENIX Conference on File and Storage Technologies (FAST 15), 2015, Page(s) 317-330

Dissecting UbuntuOne - Autopsy of a Global-scale Personal Cloud Back-end

Author(s): Raúl Gracia-Tinedo, Yongchao Tian, Josep Sampé, Hamza Harkous, John Lenton, Pedro García-López, Marc Sánchez-Artigas, Marko Vukolic
Published in: Proceedings of the 2015 ACM Conference on Internet Measurement Conference - IMC '15, 2015, Page(s) 155-168
DOI: 10.1145/2815675.2815677

Giving wings to your data: A first experience of Personal Cloud interoperability

Author(s): Raúl Gracia-Tinedo, Cristian Cotes, Edgar Zamora-Gómez, Genís Ortiz, Adrián Moreno-Martínez, Marc Sánchez-Artigas, Pedro García-López, Raquel Sánchez, Alberto Gómez, Anastasio Illana
Published in: Future Generation Computer Systems, Issue 78, 2018, Page(s) 1055-1070, ISSN 0167-739X
DOI: 10.1016/j.future.2017.01.027

RSD: Rate-Based Sync Deferment for Personal Cloud Storage Services

Author(s): Raul Saiz-Laudo, Marc Sanchez-Artigas, Pedro Garcia-Lopez
Published in: IEEE Communications Letters, Issue 21/11, 2017, Page(s) 2384-2387, ISSN 1089-7798
DOI: 10.1109/lcomm.2017.2731848

NG-DBSCAN: Scalable Density-Based Clustering for Arbitrary Data

Author(s): Alessandro Lulli, Matteo Dell’Amico, Pietro Michiardi, Laura Ricci
Published in: Proceedings of the VLDB Endowment, Issue 10/3, 2016, Page(s) 157-168, ISSN 2150-8097

Enhancing Tree-Based ORAM Using Batched Request Reordering

Author(s): Marc Sanchez-Artigas
Published in: IEEE Transactions on Information Forensics and Security, Issue 13/3, 2018, Page(s) 590-604, ISSN 1556-6013
DOI: 10.1109/tifs.2017.2762824

The power of swarming in personal clouds under bandwidth budget

Author(s): Rahma Chaabouni, Marc Sánchez-Artigas, Pedro García-López, Lluís Pàmies-Juàrez
Published in: Journal of Network and Computer Applications, Issue 65, 2016, Page(s) 48-71, ISSN 1084-8045
DOI: 10.1016/j.jnca.2016.02.006

IOStack: Software-Defined Object Storage

Author(s): Raul Gracia-Tinedo, Pedro Garcia-Lopez, Marc Sanchez-Artigas, Josep Sampe, Yosef Moatti, Eran Rom, Dalit Naor, Ramon Nou, Toni Cortes, William Oppermann, Pietro Michiardi
Published in: IEEE Internet Computing, Issue 20/3, 2016, Page(s) 10-18, ISSN 1089-7801
DOI: 10.1109/MIC.2016.46