Sustainable Data Lakes for Extreme-Scale Analytics

Deliverables

Initial version of the HIN mining engine

Prototype implementation of the HIN mining library, including functionalities for similarity search and browsing, entity resolution and entity ranking.

Initial version of the visual analytics engine

First prototype of the visual analytics layer, including basic functionalities for interactive visual analytics over spatial, temporal and network data.

Query engine over virtualized data

Query planning and execution operators that natively support queries over virtualized, heterogeneous data.

Interactive visual analytics model

Model specification driving the interactive visualizations for HIN exploration, analysis and mining, including the visual interfaces and interactions for feature space exploration, model selection and parameter tuning.

System architecture

The initial architecture of the SmartDataLake platform, its individual components, and their interfaces.

Similarity search, entity resolution and ranking

Includes the attribute-based and link-based similarity measures, techniques and algorithms for search and browsing over multi-typed entities and relations, as well as the algorithms for entity resolution and ranking.

Data synopses for approximate analytics

Algorithms for approximate query answering and analytics based on adaptive data synopses.

Publications

A Parallel and Distributed Approach for Diversified Top-k Best Region Search

Author(s): Hamid Shahrivari, Matthaios Olma, Odysseas Papapetrou, Dimitrios Skoutas, Anastasia Ailamaki
Published in: Proceedings of the 23rd International Conference on Extending Database Technology (EDBT), 2020, Page(s) 265-276
DOI: 10.5441/002/edbt.2020.24

Local Similarity Search on Geolocated Time Series Using Hybrid Indexing

Author(s): Georgios Chatzigeorgakidis, Dimitrios Skoutas, Kostas Patroumpas, Themis Palpanas, Spiros Athanasiou, Spiros Skiadopoulos
Published in: Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2019, Page(s) 179-188
DOI: 10.1145/3347146.3359349

Scalable temporal clique enumeration

Author(s): Kaijie Zhu, George Fletcher, Nikolay Yakovets, Odysseas Papapetrou, Yuqing Wu
Published in: Proceedings of the 16th International Symposium on Spatial and Temporal Databases, 2019, Page(s) 120-129
DOI: 10.1145/3340964.3340987

Local Pair and Bundle Discovery over Co-Evolving Time Series

Author(s): Georgios Chatzigeorgakidis, Dimitrios Skoutas, Kostas Patroumpas, Themis Palpanas, Spiros Athanasiou, Spiros Skiadopoulos
Published in: Proceedings of the 16th International Symposium on Spatial and Temporal Databases, 2019, Page(s) 160-169
DOI: 10.1145/3340964.3340982

Automatic Clustering by Detecting Significant Density Dips in Multiple Dimensions

Author(s): Pantelis Chronis, Spiros Athanasiou, Spiros Skiadopoulos
Published in: 2019 IEEE International Conference on Data Mining (ICDM), 2019, Page(s) 91-100
DOI: 10.1109/icdm.2019.00019

Taster: Self-Tuning, Elastic and Online Approximate Query Processing

Author(s): Matthaios Olma, Odysseas Papapetrou, Raja Appuswamy, Anastasia Ailamaki
Published in: 2019 IEEE 35th International Conference on Data Engineering (ICDE), 2019, Page(s) 482-493
DOI: 10.1109/icde.2019.00050

GPU-accelerated data management under the test of time

Author(s): Aunn Raza, Periklis Chrysogelos, Panagiotis Sioulas, Vladimir Indjic, Angelos Christos Anadiotis, Anastasia Ailamaki
Published in: Conference on Innovative Data Systems Research (CIDR), Issue 1, 2020
DOI: 10.5281/zenodo.3827490

Similarity search over enriched geospatial data

Author(s): Kostas Patroumpas, Dimitrios Skoutas
Published in: Proceedings of the Sixth International ACM SIGMOD Workshop on Managing and Mining Enriched Geo-Spatial Data, 2020, Page(s) 1-6
DOI: 10.1145/3403896.3403967

JedAI3: beyond batch, blocking-based Entity Resolution

Author(s): George Papadakis, Leonidas Tsekouras, Manos Thanos, Nikiforos Pittaras, Giovanni Simonini, Dimitrios Skoutas, Paul Isaris, George Giannakopoulos, Themis Palpanas, Manolis Koubarakis
Published in: Proceedings of the 23rd International Conference on Extending Database Technology (EDBT), 2020, Page(s) 603-606
DOI: 10.5441/002/edbt.2020.74

Adaptive HTAP through Elastic Resource Scheduling

Author(s): Aunn Raza, Periklis Chrysogelos, Angelos Christos Anadiotis, Anastasia Ailamaki
Published in: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, 2020, Page(s) 2043-2054
DOI: 10.1145/3318464.3389783

explAIner: A Visual Analytics Framework for Interactive and Explainable Machine Learning

Author(s): Thilo Spinner, Udo Schlegel, Hanna Schäfer, Mennatallah El-Assady
Published in: IEEE Transactions on Visualization and Computer Graphics, 2019, Page(s) 1-1, ISSN 1077-2626
DOI: 10.1109/TVCG.2019.2934629

Visual Exploration of Geolocated Time Series with Hybrid Indexing

Author(s): Georgios Chatzigeorgakidis, Kostas Patroumpas, Dimitrios Skoutas, Spiros Athanasiou, Spiros Skiadopoulos
Published in: Big Data Research, Issue 15, 2019, Page(s) 12-28, ISSN 2214-5796
DOI: 10.1016/j.bdr.2019.02.001

Uncertainty-Aware Principal Component Analysis

Author(s): Jochen Görtler, Thilo Spinner, Dirk Streeb, Daniel Weiskopf, Oliver Deussen
Published in: IEEE Transactions on Visualization and Computer Graphics, Issue 26/1, 2020, Page(s) 822-831, ISSN 1077-2626
DOI: 10.1109/tvcg.2019.2934812

Blocking and Filtering Techniques for Entity Resolution

Author(s): George Papadakis, Dimitrios Skoutas, Emmanouil Thanos, Themis Palpanas
Published in: ACM Computing Surveys, Issue 53/2, 2020, Page(s) 1-42, ISSN 0360-0300
DOI: 10.1145/3377455

v‐plots: Designing Hybrid Charts for the Comparative Analysis of Data Distributions

Author(s): Michael Blumenschein, Luka J. Debbeler, Nadine C. Lages, Britta Renner, Daniel A. Keim, Mennatallah El‐Assady
Published in: Computer Graphics Forum, Issue 39/3, 2020, Page(s) 565-577, ISSN 0167-7055
DOI: 10.1111/cgf.14002

Datasets

Wikidata Companies Graph

Author(s): Pantelis Chronis
Published in: Zenodo

OSM Businesses & Organizations

Author(s): Pantelis Chronis
Published in: Zenodo

CorpWatch Companies Graph

Author(s): Pantelis Chronis
Published in: Zenodo

GDELT Articles Graph

Author(s): Pantelis Chronis
Published in: Zenodo

DBLP Publications Network

Author(s): Pantelis Chronis
Published in: Zenodo