Project description
Optimisation and improvement of extreme data mining
Data and extracted knowledge have become a crucial part of the digital transformation of many sectors. The process of extracting knowledge from data relies on data mining, which requires optimisation for efficient and specific data collection. However, data mining cannot cope with situations where data characteristics are extreme. This creates a developmental bottleneck, and solutions are needed to allow for efficient data mining across computing continuums that can handle extreme data characteristics. The EU-funded EXTRACT project offers a solution to this challenge. It will develop a data-driven open-source software platform that will utilise a vast array of computing technologies to guarantee safety, improved performance and energy efficiency while allowing for extreme data mining.
Objective
Data has become one of the most valuable assets, driving the digital transformation across many sectors. Current data mining solutions are optimized to deal with specific data requirements, but fail to cope as the data characteristics become extreme. There is therefore an urgent need for novel and holistic approaches to enable the development, deployment and efficient execution of data mining workflows across a heterogeneous, secure and energy-efficient compute continuum, while fulfilling the diverse extreme data characteristics. To fill this technological gap, EXTRACT will deliver a data-driven open-source software platform integrating the most relevant technologies, to facilitate the development of trustworthy, accurate, fair and green data mining workflows able to generate high-quality actionable knowledge. The EXTRACT platform will improve the complete lifecycle of extreme data mining workflows, significantly enhancing performance, energy-efficiency, scalability and security, while fulfilling the extreme data characteristics in a holistic way. Moreover, multiple computing technologies, from edge to cloud to HPC, will be integrated into a unified and secure compute continuum. Specifically, the platform will feature enhanced data infrastructures and AI & big-data frameworks, novel data-driven orchestration and distributed monitoring mechanisms, a unified continuum abstraction, and cybersecurity and digital privacy across all software layers. The EXTRACT platform will be validated in two real-world use cases with different extreme data requirements: 1) a Personalized Evacuation Route service, integrating data from the European data sources Copernicus and Galileo with 5G localization signals and smart city IoT sensors for civilian-centric crisis management; and 2) Transient Astrophysics with a SKA pathfinder, processing extreme data from 2000 radio-telescopes for the real-time assessment of solar activity, generating knowledge for further scientific exploitation.
Fields of science
CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques.
- engineering and technology > civil engineering > urban engineering > smart cities
- engineering and technology > electrical engineering, electronic engineering, information engineering > information engineering > telecommunications > telecommunications networks > mobile network > 5G
- natural sciences > computer and information sciences > internet > internet of things
- natural sciences > computer and information sciences > data science > data mining
- engineering and technology > electrical engineering, electronic engineering, information engineering > electronic engineering > sensors
Funding Scheme
HORIZON-RIA - HORIZON Research and Innovation Actions
Coordinator
08034 Barcelona
Spain