Periodic Reporting for period 1 - RELAX (Relaxed Semantics Across the Data Analytics Stack)
Reporting period: 2023-03-01 to 2025-02-28
For graph data, we initially focused our research on the Single-Source Shortest Path (SSSP) problem, a fundamental problem in graph analysis. DC12 has shown that, by relaxing the synchronous execution model, it is possible to achieve a better trade-off between parallelism-induced redundant work and efficient parallelism. The proposed algorithm achieves competitive or better performance compared to the state of the art. Combining results from this research with those of another DC project, we formed the RELAXed Traversers team, which competed in and won the best-solution award of the FastCode Programming Challenge for the fastest SSSP solver. This challenge was hosted by the 30th ACM Annual Symposium on Principles and Practice of Parallel Programming (PPoPP 2025). Our solution, "Relax and don’t Stop: Graph-aware Asynchronous SSSP", processed 150.89 million edges per second, significantly outperforming the second-best solution, which processed 52.56 million edges per second.
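To illustrate the kind of relaxation involved, the sketch below shows a minimal worklist-based, label-correcting SSSP in Python. It is not the competition solution: distances are simply re-examined whenever they improve, rather than being processed in strictly synchronised rounds, which is the idea that an asynchronous parallel variant exploits. The function name sssp_label_correcting and the graph encoding are hypothetical choices for the example.

```python
# Minimal illustrative sketch (not the RELAXed Traversers code): a sequential
# label-correcting SSSP where vertices are re-examined whenever a shorter
# distance is found, without enforcing round-by-round synchronisation.
import heapq

def sssp_label_correcting(graph, source):
    """graph: dict mapping vertex -> list of (neighbour, weight) pairs."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0.0
    # The priority queue approximates "process the most promising vertex first";
    # an asynchronous parallel variant would let workers pull from a shared,
    # only loosely ordered worklist instead, accepting some redundant work.
    work = [(0.0, source)]
    while work:
        d, u = heapq.heappop(work)
        if d > dist[u]:          # stale entry: distance already improved
            continue
        for v, w in graph[u]:
            if d + w < dist[v]:  # relaxation step
                dist[v] = d + w
                heapq.heappush(work, (dist[v], v))
    return dist

# Example usage on a tiny weighted graph:
g = {"a": [("b", 1.0), ("c", 4.0)], "b": [("c", 2.0)], "c": []}
print(sssp_label_correcting(g, "a"))  # {'a': 0.0, 'b': 1.0, 'c': 3.0}
```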
For neural networks (NNs) we obtained several results that show how relaxation can improve efficiency. DC7 has presented a stochastic weight-sharing quantisation technique specifically tailored to Bayesian NNs (BNNs) that can significantly reduce (relax) the effective number of parameters of a BNN while obtaining results on par with the state of the art on large datasets and architectures. Complementary to this result, DC6 has developed methods to analyse the behaviour of NNs under small changes in inputs, parameters, and activation values when compression techniques such as quantisation and pruning are applied, covering an infinite family of quantisation schemes. For deep NNs (DNNs), DC2 has shown that by relaxing the synchronization requirements during the parallel training phase in a controllable way, one can achieve faster training without losing accuracy.
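The sketch below gives a deterministic, simplified view of weight sharing: all weights of a layer are mapped onto a small set of shared values, so only the shared values and per-weight indices need to be stored. It is only an assumption-laden illustration of why sharing reduces the effective parameter count; DC7's technique is stochastic and tailored to Bayesian NNs, and the function weight_share and its parameters are hypothetical.

```python
# Hypothetical sketch of weight-sharing quantisation via 1-D k-means:
# k shared centres replace the individual weight values, reducing the
# effective number of distinct parameters (DC7's method is stochastic
# and BNN-specific; this is only an illustration of the principle).
import numpy as np

def weight_share(weights, k=16, iters=20, seed=0):
    """Cluster a weight tensor into k shared values (simple k-means)."""
    rng = np.random.default_rng(seed)
    flat = weights.ravel()
    centres = rng.choice(flat, size=k, replace=False)
    for _ in range(iters):
        # Assign every weight to its nearest shared centre.
        assign = np.abs(flat[:, None] - centres[None, :]).argmin(axis=1)
        # Move each centre to the mean of its assigned weights.
        for j in range(k):
            members = flat[assign == j]
            if members.size:
                centres[j] = members.mean()
    assign = np.abs(flat[:, None] - centres[None, :]).argmin(axis=1)
    # Only the k centre values plus per-weight indices need to be stored.
    return centres[assign].reshape(weights.shape), centres, assign

w = np.random.default_rng(1).normal(size=(64, 64))
w_q, centres, idx = weight_share(w, k=16)
print("unique shared values:", np.unique(w_q).size)  # at most 16
```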
For data streams, among other topics, we investigated data summarization, data compression, and explainable AI (XAI). Data summarization has emerged as a useful technique for extracting insights from massive data sets into compact synopsis structures, typically requiring much less space and computation. DC1 has proposed a novel concurrent data structure for finding heavy hitters, a fundamental operator in data analysis. The proposed parallel method sustains significantly higher throughput than state-of-the-art methods while also supporting higher accuracy. DC9 has developed a feature importance method that finds the model’s important features for the task of discriminating between a pair of classes; the technique can be computed efficiently in a streaming scenario. Feature importance is one of the most popular techniques in XAI.
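For context, the sketch below shows what a heavy-hitter synopsis computes, using the classic sequential Misra-Gries summary: a bounded set of counters that is guaranteed to contain every item occurring more than n/k times in a stream of n items. It is only a textbook baseline for illustration; DC1's contribution is a concurrent data structure with higher throughput and accuracy, which is not reproduced here.

```python
# Sequential textbook Misra-Gries heavy-hitter summary, shown only to
# illustrate the operator; DC1's concurrent structure is not reproduced here.
def misra_gries(stream, k):
    """Track up to k-1 candidate heavy hitters over a stream of items."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # Decrement all counters; drop those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    # Every item with frequency > n/k is guaranteed to appear in the result.
    return counters

stream = list("aababcabdaae")
print(misra_gries(stream, k=3))  # 'a' (6 of the 12 items) is reported
```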
In addition to the scientific and research outcomes, by the end of the project the RELAX European Doctoral Network is expected to have produced a cohort of 12 highly mobile and adaptable researchers who are experts in the design of scalable and efficient data-intensive software systems, addressing a critical skills gap in data analytics expertise that is needed to support innovation and employment in a fast-growing European data economy.