CORDIS - EU research results

A distributed data-mining software platform for extreme data across the compute continuum

Project description

Optimisation and improvement of extreme data mining

Data and the knowledge extracted from it have become a crucial part of the digital transformation of many sectors. Knowledge extraction relies on data mining, and current data mining solutions are optimised for specific data requirements. They cannot cope when data characteristics become extreme. This creates a development bottleneck: solutions are needed that enable efficient data mining across the compute continuum while handling extreme data characteristics. The EU-funded EXTRACT project addresses this challenge. It will develop a data-driven open-source software platform that integrates a wide range of computing technologies to improve security, performance and energy efficiency for extreme data mining.


Data has become one of the most valuable assets, driving the digital transformation across many sectors. Current data mining solutions are optimized to deal with specific data requirements, but fail to cope as the data characteristics become extreme. There is therefore an urgent need for novel and holistic approaches that enable the development, deployment and efficient execution of data mining workflows across a heterogeneous, secure and energy-efficient compute continuum, while fulfilling the diverse extreme data characteristics. To fill this technological gap, EXTRACT will deliver a data-driven open-source software platform integrating the most relevant technologies, to facilitate the development of trustworthy, accurate, fair and green data mining workflows able to generate high-quality actionable knowledge. The EXTRACT platform will improve the complete lifecycle of extreme data mining workflows, significantly enhancing performance, energy efficiency, scalability and security, while fulfilling the extreme data characteristics in a holistic way. Moreover, multiple computing technologies, from edge to cloud to HPC, will be integrated into a unified and secure compute continuum. Specifically, the platform will feature enhanced data infrastructures and AI and big-data frameworks, novel data-driven orchestration and distributed monitoring mechanisms, a unified continuum abstraction, and cybersecurity and digital privacy across all software layers.

The EXTRACT platform will be validated in two real-world use cases with different extreme data requirements: 1) a Personalized Evacuation Route service, integrating data from the European data sources Copernicus and Galileo with 5G localization signals and smart city IoT sensors for civilian-centric crisis management; and 2) Transient Astrophysics with an SKA pathfinder, processing extreme data from 2000 radio telescopes for the real-time assessment of solar activity, generating knowledge for further scientific exploitation.


Net EU contribution
€ 847 000,00
08034 Barcelona

Este Cataluña Barcelona
Activity type
Research Organisations
Total cost
€ 847 000,00

Participants (10)

Partners (1)