CORDIS - EU research results

EXPeriment driven and user eXPerience oriented analytics for eXtremely Precise outcomes and decisions

Periodic Reporting for period 1 - ExtremeXP (EXPeriment driven and user eXPerience oriented analytics for eXtremely Precise outcomes and decisions)

Reporting period: 2023-01-01 to 2024-06-30

ExtremeXP aims to provide accurate, precise, fit-for-purpose, and trustworthy data-driven insights by evaluating different complex analytics variants while taking end users’ preferences and feedback into account in an automated way. ExtremeXP is motivated by significant gaps identified in the current state of the art in big data analytics and automated machine learning, which lead to underutilizing user expertise, insights and feedback. Several existing big data frameworks and architectures support processing, analytics, ML, simulations, and visualizations for large data volumes, given the proper infrastructure; however, they mainly focus on efficiency and scalability in pre-designed data analytics workflows. Existing Automated ML (AutoML) frameworks streamline the whole process of providing optimised ML models, from data ingestion and pre-processing, through model selection and training, to model execution and result visualization; however, they do not involve end (business) users in the loop.
To address such limitations, ExtremeXP proposes a new paradigm for data analytics, which we call experimentation-driven analytics. Its main contribution is that it puts the end user (their requirements, preferences, constraints, interpretation, explanations, feedback, and decision making) at the centre of complex analytics processes, from data discovery to novel interactions. It proposes a human-in-the-loop experimentation approach for gaining knowledge and making decisions from data with varying and extreme characteristics.
In ExtremeXP, an experiment considers alternative workflow variants (differing in datasets, features, algorithms, models, simulations, visualizations) in order to respond to a user intent (expressed via preferences or constraints), executes them, and evaluates them in an automated or semi-automated way, based on both system-level metrics (latency, accuracy, precision, specificity, anonymity) and feedback from the user. ExtremeXP integrates interactive visualization and explainability techniques to increase the trustworthiness of the outcomes as well as of the process followed to reach them. To achieve the above, ExtremeXP aims to produce the following results:
• Modelling framework and reference architecture for complex experiment-driven analytics.
• Experimentation engine for automating the scheduling, evaluation, and adaptation of complex analytics.
• Analysis-aware data integration concept and methods.
• Methods for Automated ML (AutoML) with user constraints.
• Support for user involvement in complex experiment-driven analytics.
• Explainability-oriented user interaction toolset.
• Interactive visualisation support including augmented reality and serious games.
• Holistic data and knowledge management supporting privacy and security.
• Five successful pilot demonstrators to validate ExtremeXP through deployment in relevant environments.
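The notion of an experiment described above (alternative workflow variants, a user intent expressed as preferences and constraints, system-level metrics, and user feedback) can be illustrated with a minimal Python sketch. This is not the ExtremeXP implementation: the names `Variant`, `Intent` and `run_experiment`, the metric names, and all numeric values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Metrics = Dict[str, float]


@dataclass
class Variant:
    """One alternative workflow configuration (hypothetical structure)."""
    name: str
    run: Callable[[], Metrics]  # executes the workflow, returns system-level metrics


@dataclass
class Intent:
    """A user intent: hard constraints plus weighted preferences (assumed encoding)."""
    constraints: Metrics  # upper bounds per metric, e.g. {"latency": 2.0}
    weights: Metrics      # preference weight per metric, e.g. {"accuracy": 1.0}


def run_experiment(
    variants: List[Variant],
    intent: Intent,
    feedback: Optional[Callable[[str, Metrics], float]] = None,
    feedback_weight: float = 1.0,
) -> List[Tuple[str, float]]:
    """Execute every variant, drop those violating a constraint, rank the rest."""
    ranked = []
    for variant in variants:
        metrics = variant.run()
        # A missing metric is treated as violating its constraint.
        violates = any(
            metrics.get(name, float("inf")) > bound
            for name, bound in intent.constraints.items()
        )
        if violates:
            continue
        # Weighted sum of the system-level metrics the user cares about.
        score = sum(w * metrics.get(name, 0.0) for name, w in intent.weights.items())
        # Optional human-in-the-loop term: a score supplied by the user.
        if feedback is not None:
            score += feedback_weight * feedback(variant.name, metrics)
        ranked.append((variant.name, score))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical variants with made-up metric values.
variants = [
    Variant("fast_tree", lambda: {"accuracy": 0.80, "latency": 0.5}),
    Variant("deep_net", lambda: {"accuracy": 0.90, "latency": 3.0}),
    Variant("boosted", lambda: {"accuracy": 0.86, "latency": 1.5}),
]
intent = Intent(constraints={"latency": 2.0}, weights={"accuracy": 1.0})

for name, score in run_experiment(variants, intent):
    print(name, round(score, 2))
```

In this sketch, `deep_net` is excluded because it exceeds the latency bound, and the remaining variants are ranked by accuracy. Passing a `feedback` callable shows how a user's judgement could change the ranking produced from system-level metrics alone.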
ExtremeXP will realize its goals via five use cases in critical domains such as crisis management, predictive maintenance, mobility, public safety and cyber-security. The project started by analyzing the requirements of the use cases, their datasets, user intents, needs for complex analytics and experimentation strategies. It proceeded with designing the modelling and architectural elements of the framework, in parallel with developing prototype tools for (i) designing and executing experiments, (ii) enabling complex analytics (e.g. for data selection, data integration, feature augmentation, constrained ML optimization) throughout the entire pipeline of an experiment, (iii) secure and trustworthy access control and data management, and (iv) interactive user participation and feedback in the experimentation process, together with experiment-aware visualizations and explainability. Building on that, focus is increasingly being placed on applying this new modelling approach to the use cases, and on the further, iterative development and enhancement of the first prototypes and their early integration into the first release of the ExtremeXP framework. In brief, ExtremeXP has:
• Developed the meta-model for the design and execution of experiments within the ExtremeXP framework. The meta-model is backed by a Domain Specific Modelling Language (DSML), which allows the specification of the core concepts in ExtremeXP, i.e. user intents, constraints, workflows and tasks, metrics, and variants (configurations) of an experiment.
• Designed the architectural blueprint of the ExtremeXP framework and, based on this, implemented the first versions of the modules comprising the core framework. These include: (a) the experiment modelling component; (b) the experiment execution engine; (c) the module for capturing user intents, preferences and constraints, and mapping them to experiments.
• Developed a set of tools for scalable data management for complex analytics, including: (a) a data selection module, which offers automated dataset selection strategies for the current analytics task; (b) an analysis-aware data integration module, which offers query-driven and ML-driven functionality for data interlinking, based on user-specified criteria; (c) a data augmentation toolkit that generates synthetic data to augment datasets and improve the overall accuracy of ML tasks in experiments.
• Developed ML algorithms that are aware of constraints, with the goal of enhancing the ML model’s performance under a specific user intent. Two types of constraints have been covered under two learning paradigms: supervised and unsupervised learning.
• Developed a framework for interactive visualization and explainability, offering functionality for: (a) monitoring experiment execution; (b) interactive visualization of data, results, metrics and visual explanations; (c) generating explanations about feature importance and experiment variants (e.g. hyperparameters used in an ML task) using several XAI methods.
• Developed a set of modules for context-aware access control and knowledge management allowing for the definition of specific access control policies along with the contextual attributes and handlers utilized to enforce them effectively within experiment design processes.
• Implemented all the aforementioned models, architectures, tools and frameworks by actively consulting its five use cases with respect to their experimentation needs.
ExtremeXP has achieved progress beyond the state of the art in various research fields relevant to the project’s objectives, including methods for dataset and feature selection, data integration, ML pipeline optimization, and optimized visual exploration. These results are reflected in publications in high-ranking venues (e.g. VLDB, ICDE, TKDE, EDBT, ISWC, DOLAP), demonstrating significant gains in efficiency, effectiveness and user experience, depending on the task. These works, alongside additional ongoing research to be published during the second period of the project, are expected to have a high impact on the respective research fields. Further, the publication of the Domain Specific Modelling Language (DSML), designed to capture experiment workflows, their composition, and deployments, is also expected to have a high impact in the respective communities (e.g. pipeline optimization, MLOps).