Scalable online machine learning for predictive analytics and real-time interactive visualization

Deliverables

Final demonstrator

Final demonstrator, together with the final report, which includes the evaluation of all the technology developed during the project as applied to the Hot Strip Mill process at the ArcelorMittal steelmaking factory.

Hybrid computation tested system

Tested system implementation of hybrid computation for Apache Flink

Second prototype (V2)

Second version of the evolving prototype in the validation scenario.

Optimizer Prototype

Prototype of a domain specific optimizer for the declarative language and Apache Flink

Scalable online machine learning algorithms for streaming

This deliverable will introduce Version 1 of SOLMA that encompasses new scalable online machine learning algorithms.

Optimizer finished implementation

Finished implementation of a domain specific optimizer for the declarative language and Apache Flink.

Updateable-state management prototype implementation

Implemented system for updateable state for Apache Flink

Third prototype (V3)

Third version of the evolving prototype in the validation scenario.

Software implementation and integration with Apache Flink

This deliverable includes the implementation of the three layers of the proposed technical solution. The Data Collector and Incremental Analytics Engine layers will be implemented within the core of Apache Flink. The Visualization layer will be implemented as a client-side library.

Basic scalable streaming algorithms

This deliverable, in the form of software (together with publications), will present Version 0 of the library, covering a set of basic scalable streaming algorithms produced in Task 4.2.

Scalable drift and anomaly detection

This deliverable will result in Version 2 of SOLMA covering new scalable drift and anomaly detection algorithms.

Hybrid computation prototype implementation

Prototypical implementation of hybrid computation for Apache Flink covering basic workflows

First prototype (V1)

The first version of the evolving prototype in the validation scenario. An associated evolving document will provide, for each prototype execution, the objectives definition, KPIs involved and their evaluation after the prototype execution phase.

Declarative language tested implementation

Tested implementation of a declarative language for (online) machine learning

Declarative language finished implementation

Finished implementation of a declarative language for (online) machine learning

Scalable Online algorithms in Flink

This deliverable will release the final implementation in Flink of the streaming algorithms produced earlier through D4.2-D4.4.

Declarative language prototype implementation

Implementation of a basic declarative language prototype

Report on scientific dissemination activities – V1

Details of scientific dissemination activities and materials, along with the timeline and success indicators. It includes a record of the scientific dissemination activities undertaken during the first half of the project and those planned for the second period.

Scenario details and objectives description

This document details the Hot Strip Mill process in terms of sensor data characteristics and data workflow. It also describes the scenario objectives from the end-user perspective.

Report on community engagement and technology transfer activities – V2

The final version of the deliverable compiles a record of all the community engagement and technology transfer activities carried out in the course of the project.

Scenario development and KPI definition for the PROTEUS solution

A report presenting a review of benchmarks, the typical scenarios used to define the parameters of the PROTEUS solution, and the resulting requirements, benchmarks and KPIs.

PROTEUS evaluation and impact assessment

A report which details the gains associated with the PROTEUS solution, using quantitative information, and which identifies areas for further improvement and investment

Guidelines for interacting and visualization information in Big Data environments

This document presents the results of research into new ways of presenting and working with large amounts of data and streaming data.

Visualization requirements for massive online machine learning strategies

This deliverable defines functional and non-functional requirements for the visualization system regarding online machine learning strategies

Report on project communication and engagement activities – V2

The final report of communication and engagement activities, compiling a list of all activities developed for communication with other relevant initiatives in the course of the project.

Report on scientific dissemination activities – V2

Final report of scientific dissemination activities. The final version compiles a record of all activities related to scientific dissemination developed in the course of the project.

Catalogue of scientific and technical requirements

This document describes the catalogue of scientific and technical challenges/requirements derived from the industrial scenario needs.

Report on community engagement and technology transfer activities – V1

Details of the community engagement and technology transfer strategy for the project. The intermediate report includes a record of the community creation, community engagement, and technology transfer activities carried out during the first half of the project, and those planned for the second half.

Declarative language syntax definition

Syntax definition for a declarative language based on machine learning requirements

Architecture design for supporting incremental visual methods

This deliverable defines the technical design of the three-layer architecture for implementing the visualization system.

Report on project communication and engagement activities – V1

Details of communication and engagement activities and materials, along with the timeline and success indicators. It includes a record of the communication activities undertaken during the first half of the project and those planned for the second period.

Investigative overview of targeted techniques and algorithms

The state of the art of scalable streaming algorithms for distributed environments, non-scalable streaming algorithms, and selected prominent non-streaming and non-scalable algorithms that can be approximated by an online version.

PROTEUS factsheet leaflet

The PROTEUS factsheet will be an early leaflet for dissemination and communication purposes, summarizing the most relevant information about the project, and will be available from the very beginning as an initial public brochure.

PROTEUS project website

PROTEUS project public website, to be active and regularly updated during the whole project.

Publications

Efficient Migration of Very Large Distributed State for Scalable Streaming Processing

Author(s): Bonaventura Del Monte
Published in: Proceedings of the VLDB 2017 PhD Workshop, Issue 28 August 2017, 2017

Non-dominated solutions visualization in multiobjective optimization: application to assembly line balancing

Author(s): Krzysztof Trawinski, Manuel Chica, David P. Pancho, Sergio Damas, and Oscar Cordón
Published in: Proceedings of the MIC and MAEB 2017 Conferences, Issue June 2017, 2017, Page(s) 963-972

Scotty: Efficient Window Aggregation for Out-of-Order Stream Processing

Author(s): Jonas Traub, Philipp Marian Grulich, Alejandro Rodriguez Cuellar, Sebastian Bress, Asterios Katsifodimos, Tilmann Rabl, Volker Markl
Published in: 2018 IEEE 34th International Conference on Data Engineering (ICDE), 2018, Page(s) 1300-1303
DOI: 10.1109/ICDE.2018.00135

Scalable online learning for flink - SOLMA library

Author(s): W. Jamil, N-C. Duong, W. Wang, C. Mansouri, S. Mohamad, A. Bouchachia
Published in: Proceedings of the 12th European Conference on Software Architecture Companion Proceedings - ECSA '18, 2018, Page(s) 1-4
DOI: 10.1145/3241403.3241438

Benchmarking Distributed Stream Data Processing Systems

Author(s): Jeyhun Karimov, Tilmann Rabl, Asterios Katsifodimos, Roman Samarev, Henri Heiskanen, Volker Markl
Published in: 2018 IEEE 34th International Conference on Data Engineering (ICDE), 2018, Page(s) 1507-1518
DOI: 10.1109/ICDE.2018.00169

Aggregation Algorithm Vs. Average for Time Series Prediction

Author(s): Bouchachia, Abdelhamid; Kalnishkan, Y; Jamil, W.
Published in: ECML/PKDD 2016 Workshop on Large-scale Learning from Data Streams in Evolving Environments (STREAMEVOLV-2016), Issue 1, 2016, Page(s) 69-82

Bridging the gap: towards optimization across linear and relational algebra

Author(s): Andreas Kunft, Alexander Alexandrov, Asterios Katsifodimos, Volker Markl
Published in: BeyondMR '16 Proceedings of the 3rd ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond, Issue BeyondMR '16 26-06-2016, 2016
DOI: 10.1145/2926534.2926540

Emma in Action: Declarative Dataflows for Scalable Data Analysis

Author(s): Alexander Alexandrov, Andreas Salzmann, Georgi Krastev, Asterios Katsifodimos, Volker Markl
Published in: ACM SIGMOD '16 Proceedings of the 2016 SIGMOD International Conference on Management of Data, Issue Sigmod16, 26-06-2016, 2016, Page(s) 2073-2076
DOI: 10.1145/2882903.2899396

Implicit Parallelism through Deep Language Embedding

Author(s): Alexander Alexandrov, Asterios Katsifodimos, Georgi Krastev, Volker Markl
Published in: ACM SIGMOD Record, Issue Volume 45, Number 1, March 2016, 2016, Page(s) 51-58, ISSN 0163-5808
DOI: 10.1145/2949741.2949754

An Incremental Approach for Real-Time Big Data Visual Analytics

Author(s): Ignacio Garcia, Ruben Casado, Abdelhamid Bouchachia
Published in: 2016 IEEE 4th International Conference on Future Internet of Things and Cloud Workshops (FiCloudW), 2016, Page(s) 177-182
DOI: 10.1109/W-FiCloud.2016.46

A non-parametric hierarchical clustering model

Author(s): Saad Mohamad, Abdelhamid Bouchachia, Moamar Sayed-Mouchaweh
Published in: 2015 IEEE International Conference on Evolving and Adaptive Intelligent Systems (EAIS), Issue STREAMEVOLV-2016, 23 September 2016, 2015, Page(s) 1-7
DOI: 10.1109/EAIS.2015.7368803

Active Learning for Data Streams under Concept Drift and concept evolution

Author(s): Saad Mohamad, Moamar Sayed-Mouchaweh and Abdelhamid Bouchachia
Published in: ECML/PKDD 2016 Workshop on Large-scale Learning from Data Streams in Evolving Environments, Issue STREAMEVOLV-2016, 23 September 2016, 2016, Page(s) 51-68

LIBIRWLS: A parallel IRWLS library for full and budgeted SVMs

Author(s): Roberto Díaz-Morales, Ángel Navia-Vázquez
Published in: Knowledge-Based Systems, Issue 136, 2017, Page(s) 183-186, ISSN 0950-7051
DOI: 10.1016/j.knosys.2017.09.007

Batch-based active learning: Application to social media data for crisis management

Author(s): Daniela Pohl, Abdelhamid Bouchachia, Hermann Hellwagner
Published in: Expert Systems with Applications, Issue 93, 2018, Page(s) 232-244, ISSN 0957-4174
DOI: 10.1016/j.eswa.2017.10.026

Active learning for classifying data streams with unknown number of classes

Author(s): Saad Mohamad, Moamar Sayed-Mouchaweh, Abdelhamid Bouchachia
Published in: Neural Networks, Issue 98, 2018, Page(s) 1-15, ISSN 0893-6080
DOI: 10.1016/j.neunet.2017.10.004

MSAFIS: an evolving fuzzy inference system

Author(s): José de Jesús Rubio, Abdelhamid Bouchachia
Published in: Soft Computing, Issue 21/9, 2017, Page(s) 2357-2366, ISSN 1432-7643
DOI: 10.1007/s00500-015-1946-4

Blockjoin

Author(s): Andreas Kunft, Asterios Katsifodimos, Sebastian Schelter, Tilmann Rabl, Volker Markl
Published in: Proceedings of the VLDB Endowment, Issue 10/13, 2017, Page(s) 2061-2072, ISSN 2150-8097
DOI: 10.14778/3151106.3151110

Improving the efficiency of IRWLS SVMs using parallel Cholesky factorization

Author(s): Díaz Morales, R., & Navia Vázquez, Á.
Published in: Pattern Recognition Letters, Issue Volume 84, 1 December 2016, 2016, Page(s) 91-98, ISSN 0167-8655
DOI: 10.1016/j.patrec.2016.08.015

A Bi-Criteria Active Learning Algorithm for Dynamic Data Streams

Author(s): Mohamad, S., Bouchachia, A. and Sayed-Mouchaweh, M.
Published in: IEEE Transactions on Neural Networks and Learning Systems, Issue N/A (early access), 2016, Page(s) 1-13, ISSN 2162-2388
DOI: 10.1109/TNNLS.2016.2614393

Model Selection in Online Learning for Times Series Forecasting

Author(s): Waqas Jamil, Abdelhamid Bouchachia
Published in: Advances in Computational Intelligence Systems - Contributions Presented at the 18th UK Workshop on Computational Intelligence, September 5-7, 2018, Nottingham, UK, Issue 840, 2019, Page(s) 83-95
DOI: 10.1007/978-3-319-97982-3_7

Fuzzy Classifiers

Author(s): Abdelhamid Bouchachia
Published in: Handbook on Computational Intelligence, Issue May 2016, 2016, Page(s) 185-207
DOI: 10.1142/9789814675017_0005

Advances in Computational Intelligence Systems - Contributions Presented at the 18th UK Workshop on Computational Intelligence, September 5-7, 2018, Nottingham, UK

Author(s): Ahmad Lotfi, Hamid Bouchachia, Alexander Gegov, Caroline Langensiepen, Martin McGinnity
Published in: Advances in Intelligent Systems and Computing, 2019
DOI: 10.1007/978-3-319-97982-3

ECML/PKDD 2017 Workshop on IoT Large Scale Learning from Data Streams

Author(s): M.S. Mouchaweh, A. Bifet, A. Bouchachia, J. Gama, R. Ribeiro
Published in: 2017

Apache Flink: Stream and Batch Processing in a Single Engine

Author(s): Paris Carbone, Stephan Ewen, Seif Haridi, Asterios Katsifodimos, Volker Markl, Kostas Tzoumas
Published in: Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, Issue December 2015 Vol. 38 No. 4, Issue on Next-Generation Stream Processing Systems, 2015, Page(s) 28-38