CORDIS - EU research results

STREAMLINE

Deliverables

Annual Report, Quality Assurance and Evaluation Period 1

This deliverable will present a summary of the activities carried out in Y1, telling a coherent story of the work produced, referring to detailed accounts in the respective deliverables. It will additionally detail the Quality Management and Control policies of the project and explain how they were enforced in the work leading to and in the production of each of the deliverables of Y1. Finally, it will measure success through the evaluation of the measurable outcomes set out in each of the tasks described in this Description of Action, using appropriate key performance indicators.

Flink Real Time Stream Mining Library v1

Version 1 of the Flink Real Time Stream Mining Library with evaluation measurements over use case partner data. Basic classification, regression and recommendation methods for combined batch and stream machine learning based on linear models and stochastic gradient descent, also involving low memory synopses for sublinear storage of long-term updatable model components. Baseline measures defined for WP2.
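As a rough illustration of what "combined batch and stream machine learning based on linear models and stochastic gradient descent" can look like, the sketch below warm-starts a linear regressor on data at rest and then keeps updating the same model on data in motion. It is not the Flink library's code: the class `OnlineLinearRegressor`, the helper `warm_start`, the squared loss and the learning rate are all assumptions chosen for the example.

```python
from typing import Iterable, List, Tuple

Example = Tuple[List[float], float]


class OnlineLinearRegressor:
    """Hypothetical linear model trained one example at a time with SGD."""

    def __init__(self, n_features: int, learning_rate: float = 0.1) -> None:
        self.weights: List[float] = [0.0] * n_features
        self.bias: float = 0.0
        self.learning_rate = learning_rate

    def predict(self, x: List[float]) -> float:
        return sum(w * xi for w, xi in zip(self.weights, x)) + self.bias

    def update(self, x: List[float], y: float) -> None:
        # One SGD step on the squared loss 0.5 * (prediction - y) ** 2.
        error = self.predict(x) - y
        for i, xi in enumerate(x):
            self.weights[i] -= self.learning_rate * error * xi
        self.bias -= self.learning_rate * error


def warm_start(model: OnlineLinearRegressor,
               batch: Iterable[Example],
               epochs: int = 500) -> None:
    """Batch phase: make several passes over historical data (data at rest)."""
    data = list(batch)
    for _ in range(epochs):
        for x, y in data:
            model.update(x, y)


if __name__ == "__main__":
    # Assumed toy data following y = 2x + 1.
    history = [([x], 2.0 * x + 1.0) for x in (0.0, 0.25, 0.5, 0.75, 1.0)]
    model = OnlineLinearRegressor(n_features=1)
    warm_start(model, history)

    # Stream phase: the same update rule is applied to each arriving example.
    for x, y in [([0.1], 1.2), ([0.9], 2.8)]:
        print("prediction before update:", model.predict(x))
        model.update(x, y)
```

The point of the design is that the batch and stream phases share one `update` method, so the model state carries over from historical training to live updates without retraining.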

Design and Implementation v1

First iteration of the design defined and implementation carried out in T5.1, T5.2, T5.3.

Annual Report, Quality Assurance and Evaluation Period 2

This deliverable will present a summary of the activities carried out in Y2, telling a coherent story of the work produced, referring to detailed accounts in the respective deliverables. It will additionally detail the Quality Management and Control policies of the project and explain how they were enforced in the work leading to and in the production of each of the deliverables of Y2. Finally, it will measure success of Y2 activities through the evaluation of the measurable outcomes set out in each of the tasks described in this Description of Action, using appropriate key performance indicators.

Combined Data at Rest and Data in Motion Analysis Platform v2

As with all versions of the platform, it will be evaluated using the use case partner data. Delivery plans for M22 (Y2): an advanced demonstration of our platform, i.e. V2, with a larger set of optimization features, operators for unified batch-stream processing, and limited fault tolerance and incremental computation support.

Flink Real Time Stream Mining Library v3

Version v3 of the Flink Real Time Stream Mining Library with evaluation measurements over use case partner data. Final version of the online machine learning package tested and evaluated over WP4-5 business cases against Y1 baselines.

Design and Implementation v3

Third iteration of the design defined and implementation carried out in T5.1, T5.2, T5.3.

Project Plan Period 1

A detailed plan of the activities to be carried out during the first year.

Combined Data at Rest and Data in Motion Analysis Platform v3

As with all versions of the platform, it will be evaluated using the use case partner data. Delivery plans for M34 (Y3): a full-version platform V3 with all tasks implemented, tested and evaluated over WP4-5 business cases against Y1 baselines and competitor products.

Status report on dissemination activities Period 1

Detailed description of dissemination results achieved during Y1 of the project.

Dissemination Roadmap & Project Website

Define the expected project outputs, dissemination and communication activities to be developed during the entire duration of the project. Launch the project website with basic information on the project -- project goals, consortium composition, use cases descriptions -- which will then be updated continuously throughout the duration of the project.

Use case report for actionable knowledge extraction from text information

A report on extracting actionable knowledge from advanced text data mining using various machine learning algorithms, such as passive-aggressive algorithms.
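For readers unfamiliar with the passive-aggressive family, the following is a minimal sketch of an online binary text classifier using the PA-I update rule on bag-of-words features. The deliverable does not say which variant or features were used, so the class `PassiveAggressiveTextClassifier`, the `featurize` helper, the whitespace tokenizer and the aggressiveness parameter are assumptions made for illustration only.

```python
from collections import Counter
from typing import Dict


def featurize(text: str) -> Dict[str, float]:
    """Hypothetical bag-of-words features: lowercase token counts."""
    return {token: float(count)
            for token, count in Counter(text.lower().split()).items()}


class PassiveAggressiveTextClassifier:
    """Online binary classifier with labels +1 / -1 (PA-I variant assumed)."""

    def __init__(self, aggressiveness: float = 1.0) -> None:
        self.weights: Dict[str, float] = {}
        self.aggressiveness = aggressiveness

    def score(self, x: Dict[str, float]) -> float:
        return sum(self.weights.get(t, 0.0) * v for t, v in x.items())

    def predict(self, x: Dict[str, float]) -> int:
        return 1 if self.score(x) >= 0.0 else -1

    def update(self, x: Dict[str, float], y: int) -> None:
        # Hinge loss: zero when the example is classified with margin >= 1.
        loss = max(0.0, 1.0 - y * self.score(x))
        norm_sq = sum(v * v for v in x.values())
        if loss == 0.0 or norm_sq == 0.0:
            return  # "passive": the model already satisfies the margin
        # "aggressive": smallest step that fixes the margin, capped by C.
        tau = min(self.aggressiveness, loss / norm_sq)
        for t, v in x.items():
            self.weights[t] = self.weights.get(t, 0.0) + tau * y * v


if __name__ == "__main__":
    clf = PassiveAggressiveTextClassifier()
    stream = [("great service thanks", 1), ("terrible delay complaint", -1)]
    for text, label in stream:
        clf.update(featurize(text), label)
    print(clf.predict(featurize("great thanks")))
```

Because each update touches only the tokens present in the incoming document, the model can be maintained over a text stream without revisiting earlier examples.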

Design and Implementation v2

Second iteration of the design defined and implementation carried out in T5.1, T5.2, T5.3.

Combined Data at Rest and Data in Motion Analysis Platform v1

As with all versions of the platform, it will be evaluated using the use case partner data. Delivery plans for M10 (Y1): a specification document and a basic demo platform, i.e. V1, with a) a subset of query optimization features (like operator chaining) and b) primitive operators necessary for analyzing data at rest and data in motion together.

Status report on dissemination activities Period 2

Detailed description of dissemination results achieved during Period 2 of the project.

Project Plan Period 2

A detailed plan of the activities to be carried out during the second year.

Flink Real Time Stream Mining Library v2

Version 2 of the Flink Real Time Stream Mining Library with evaluation measurements over use case partner data. Advanced methods, depending on use cases, potentially including gradient boosted trees, kernel methods, implicit and explicit ALS and tensor factorization, differential privacy and peer-to-peer recommenders.

A high level declarative language for ML

A programming model to express different use cases in our high-level language, and an easy-to-use declarative language for using ML algorithms on massive datasets.

Field trials and Evaluation v1

First iteration of the field trials and evaluation carried out in T5.4.

Flink deployment software

A deployment tool for automatic installation of Flink on a cluster. It consists of the Chef cookbooks on Karamel for the Flink stack.

Flink interactive environment

An interactive data analytics tool for Apache Flink that consists of (i) a REPL, or language shell, with an interactive environment that takes user inputs, evaluates them, and returns the result to the user quickly, and (ii) a web-based environment, based on Zeppelin, that enables interactive data analyses.

Flink on Hops/Hadoop

Extension and revision of D3.1 addressing the integration of Apache Flink into the Hops/Hadoop ecosystem.

Field Trials and Evaluation v2

Second iteration of the field trials and evaluation carried out in T5.4.

Field Trials and Implementation v3

Third iteration of the field trials and evaluation carried out in T5.4.

Publications

Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive Windowing

Authors: Philipp M. Grulich, René Saitenmacher, Jonas Traub, Sebastian Breß, Tilmann Rabl, Volker Markl
Published in: 21st International Conference on Extending Database Technology (EDBT), 2018, ISBN 978-3-89318-078-3
Publisher: Open Proceedings
DOI: 10.5441/002/edbt.2018.51

Optimized on-demand data streaming from sensor nodes

Authors: Jonas Traub, Sebastian Breß, Tilmann Rabl, Asterios Katsifodimos, Volker Markl
Published in: Proceedings of the 2017 Symposium on Cloud Computing - SoCC '17, 2017, Page(s) 586-597, ISBN 9781-450350280
Publisher: ACM Press
DOI: 10.1145/3127479.3131621

STREAMLINE - Streamlined Analysis of Data at Rest and Data in Motion

Authors: Philipp M. Grulich, Tilmann Rabl, Volker Markl, Csaba Sidló, Andras Benczur
Published in: 20th International Conference on Extending Database Technology (EDBT), 2017
Publisher: CEUR Workshop Proceedings

I2: Interactive Real-Time Visualization for Streaming Data

Authors: Jonas Traub, Nikolaas Steenbergen, Philipp M. Grulich, Tilmann Rabl, Volker Markl
Published in: 20th International Conference on Extending Database Technology (EDBT), 2017, ISBN 978-3-89318-073-8
Publisher: Open Proceedings
DOI: 10.5441/002/edbt.2017.61

Bridging the gap - towards optimization across linear and relational algebra

Authors: Andreas Kunft, Alexander Alexandrov, Asterios Katsifodimos, Volker Markl
Published in: Proceedings of the 3rd ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond - BeyondMR '16, 2016, Page(s) 1-4, ISBN 9781-450343114
Publisher: ACM Press
DOI: 10.1145/2926534.2926540

Emma in Action - Declarative Dataflows for Scalable Data Analysis

Authors: Alexander Alexandrov, Andreas Salzmann, Georgi Krastev, Asterios Katsifodimos, Volker Markl
Published in: Proceedings of the 2016 International Conference on Management of Data - SIGMOD '16, 2016, Page(s) 2073-2076, ISBN 9781-450335317
Publisher: ACM Press
DOI: 10.1145/2882903.2899396

Benchmarking Distributed Stream Data Processing Systems

Authors: Jeyhun Karimov, Tilmann Rabl, Asterios Katsifodimos, Roman Samarev, Henri Heiskanen, Volker Markl
Published in: 2018 IEEE 34th International Conference on Data Engineering (ICDE), 2018, Page(s) 1507-1518, ISBN 978-1-5386-5520-7
Publisher: IEEE
DOI: 10.1109/ICDE.2018.00169

Efficient Window Aggregation with General Stream Slicing

Authors: Jonas Traub, Philipp Grulich, Alejandro Rodríguez Cuéllar, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl
Published in: 22nd International Conference on Extending Database Technology (EDBT), 2019
Publisher: Open Proceedings

Continuous Deployment of Machine Learning Pipelines

Authors: Behrouz Derakhshan, Alireza Rezaei Mahdiraji, Tilmann Rabl, Volker Markl
Published in: 22nd International Conference on Extending Database Technology (EDBT), 2019
Publisher: Open Proceedings

Tutorial on Open Source Online Learning Recommenders

Authors: Róbert Pálovics, Domokos Kelen, András A. Benczúr
Published in: Proceedings of the Eleventh ACM Conference on Recommender Systems - RecSys '17, 2017, Page(s) 400-401, ISBN 9781-450346528
Publisher: ACM Press
DOI: 10.1145/3109859.3109937

Alpenglow: Open Source Recommender Framework with Time-aware Learning and Evaluation

Authors: Erzsébet Frigó, Róbert Pálovics, Domokos Kelen, Levente Kocsis, András A. Benczúr
Published in: RecSys 2017 poster, 2017
Publisher: ACM

Online ranking prediction in non-stationary environments

Authors: Erzsébet Frigó, Róbert Pálovics, Domokos Kelen, Levente Kocsis, András A. Benczúr
Published in: RecTemp 2017 – workshop on reasoning on temporal aspects in user modeling in conjunction with RecSys 2017, 2017
Publisher: ACM

Tracing Distributed Data Stream Processing Systems

Authors: Zoltan Zvara, Peter G.N. Szabo, Gabor Hermann, Andras Benczur
Published in: 2017 IEEE 2nd International Workshops on Foundations and Applications of Self* Systems (FAS*W), 2017, Page(s) 235-242, ISBN 978-1-5090-6558-5
Publisher: IEEE
DOI: 10.1109/fas-w.2017.153

Efficient K-NN for Playlist Continuation

Authors: Domokos M. Kelen, Dániel Berecz, Ferenc Béres, András A. Benczúr
Published in: Proceedings of the ACM Recommender Systems Challenge 2018 on - RecSys Challenge '18, 2018, Page(s) 1-4, ISBN 9781-450365864
Publisher: ACM Press
DOI: 10.1145/3267471.3267477

Cutty - Aggregate Sharing for User-Defined Windows

Authors: Paris Carbone, Jonas Traub, Asterios Katsifodimos, Seif Haridi, Volker Markl
Published in: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM '16, 2016, Page(s) 1201-1210, ISBN 9781-450340731
Publisher: ACM Press
DOI: 10.1145/2983323.2983807

Benchmarking Data Flow Systems for Scalable Machine Learning

Authors: Christoph Boden, Andrea Spina, Tilmann Rabl, Volker Markl
Published in: Proceedings of the 4th Algorithms and Systems on MapReduce and Beyond - BeyondMR'17, 2017, Page(s) 1-10, ISBN 9781-450350198
Publisher: ACM Press
DOI: 10.1145/3070607.3070612

Query Centric Partitioning and Allocation for Partially Replicated Database Systems

Authors: Tilmann Rabl, Hans-Arno Jacobsen
Published in: Proceedings of the 2017 ACM International Conference on Management of Data - SIGMOD '17, 2017, Page(s) 315-330, ISBN 9781-450341974
Publisher: ACM Press
DOI: 10.1145/3035918.3064052

From BigBench to TPCx-BB: Standardization of a Big Data Benchmark

Authors: Paul Cao, Bhaskar Gowda, Seetha Lakshmi, Chinmayi Narasimhadevara, Patrick Nguyen, John Poelman, Meikel Poess, Tilmann Rabl
Published in: Performance Evaluation and Benchmarking. Traditional - Big Data - Internet of Things, Issue 10080, 2017, Page(s) 24-44, ISBN 978-3-319-54333-8
Publisher: Springer International Publishing
DOI: 10.1007/978-3-319-54334-5_3

A survey of state management in big data processing systems

Authors: Quoc-Cuong To, Juan Soto, Volker Markl
Published in: The VLDB Journal, Issue 27/6, 2018, Page(s) 847-872, ISSN 1066-8888
Publisher: Springer Verlag
DOI: 10.1007/s00778-018-0514-9

Blockjoin: efficient matrix partitioning through joins

Authors: Andreas Kunft, Asterios Katsifodimos, Sebastian Schelter, Tilmann Rabl, Volker Markl
Published in: Proceedings of the VLDB Endowment - Proceedings of the 43rd International Conference on Very Large Data Bases, Issue 10/13, 2017, Page(s) 2061-2072, ISSN 2150-8097
Publisher: VLDB Endowment

Temporal walk based centrality metric for graph streams

Authors: Ferenc Béres, Róbert Pálovics, Anna Oláh, András A. Benczúr
Published in: Applied Network Science, Issue 3/1, 2018, ISSN 2364-8228
Publisher: Springer Open
DOI: 10.1007/s41109-018-0080-5

Towards Streamlined Big Data Analytics

Authors: András A. Benczúr, Róbert Pálovics, Márton Balassi, Volker Markl, Tilmann Rabl, Juan Soto, Björn Hovstadius, Jim Dowling, Seif Haridi
Published in: ERCIM News, Issue 107, 2016, Page(s) 31-32
Publisher: ERCIM EEIG

Online Machine Learning in Big Data Streams

Authors: András A. Benczúr, Levente Kocsis, Róbert Pálovics
Published in: 2018
Publisher: MTA SZTAKI
