CORDIS - EU research results

SMARTHEP: Synergies between Machine leArning, Real Time analysis and Hybrid architectures for efficient Event Processing and decision making

Periodic Reporting for period 1 - SMARTHEP (SMARTHEP: Synergies between Machine leArning, Real Time analysis and Hybrid architectures for efficient Event Processing and decision making)

Reporting period: 2021-10-01 to 2023-09-30

The analysis of data plays a major role in decision-making in many industries. Continual advances mean that, in order to stay competitive, many organisations rely on data to make better-informed decisions about their customers, competitors, products and services. Leveraging the data available to them can determine whether organisations keep up with their competitors and remain relevant to their customers.

The volume of data available to research and industry is increasing at an exponential rate. The increase in data collection is not always matched by comparable data storage, utilisation and analysis capabilities. This means that most data produced is either discarded, not recorded or recorded and stored without being analysed.

High Energy Physics (HEP) experiments can produce hundreds of gigabytes of data per second. Current computing resources, and the time available to make decisions about the data, are not sufficient to process and use all of it. To make the most of the data in a cost-effective way, data-taking and data analysis need to become more efficient. Training a new generation of researchers to work towards Real-Time Analysis is part of the solution needed to deliver this paradigm shift.

Synergies between Machine learning, Real-Time analysis and Hybrid architectures for efficient Event Processing and decision making (SMARTHEP) is a European Training Network (ETN) that aims to train a new generation of Early Stage Researchers (ESRs) to use real-time decision-making effectively, so that data collection and analysis become synonymous.

SMARTHEP brings together scientists from the four major collaborations which have been driving the development of Real-Time Analysis (RTA) and key specialists from computer science and industry. By solving concrete problems as a community, SMARTHEP will bring forward a more widespread use of RTA techniques, enabling future HEP discoveries and generating large-scale impact on industry. In addition, ESRs will contribute to European growth by exploiting their hands-on work to produce concrete commercial deliverables in fields that can most profit from RTA, such as transport, manufacturing, and finance.
The SMARTHEP project comprises 12 Early Stage Researchers (ESRs) from 10 countries who began their PhDs in October 2022, selected from 199 applicants representing 55 nationalities. Working across the four main LHC experiments (ALICE, ATLAS, CMS, and LHCb), they focus on real-time analysis systems, known as "trigger systems". Within milliseconds, these systems perform a first-pass analysis of the massive data volume delivered by the LHC and select the data to keep for further analysis, ultimately contributing to fundamental particle measurements and searches for new particles.
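
The idea of a first-pass selection can be illustrated with a deliberately simplified sketch. It is not the trigger logic of any LHC experiment: the `Event` class, the single `energy_gev` quantity, the 50 GeV threshold and the toy event generator are all hypothetical and chosen only for illustration. The sketch shows a fast per-event decision that keeps a small fraction of events and discards the rest.

```python
import random
from dataclasses import dataclass


@dataclass
class Event:
    """Toy collision event (hypothetical, for illustration only)."""
    event_id: int
    energy_gev: float  # assumed single summary quantity per event


def first_pass_trigger(event, threshold_gev=50.0):
    """Return True if the event passes the (assumed) energy threshold."""
    return event.energy_gev >= threshold_gev


def run_trigger(events, threshold_gev=50.0):
    """Keep only the events accepted by the first-pass decision."""
    return [e for e in events if first_pass_trigger(e, threshold_gev)]


def make_toy_events(n, seed=0):
    """Generate n toy events with exponentially distributed energies."""
    rng = random.Random(seed)
    # Mean energy of 20 GeV is an arbitrary choice for this sketch.
    return [Event(i, rng.expovariate(1 / 20.0)) for i in range(n)]


if __name__ == "__main__":
    events = make_toy_events(10_000)
    kept = run_trigger(events)
    print(f"kept {len(kept)} of {len(events)} events")
```

In a real trigger the decision combines many reconstructed quantities and runs on dedicated hardware and software under strict time limits; the sketch only conveys the keep-or-discard structure.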

The goal of the SMARTHEP project is to further accelerate decision-making using machine learning and hybrid computing architectures, and to significantly enhance the available datasets with data that are collected and recorded in real time, prior to the decision on whether to keep or discard them. The main challenge is to ensure that these data are of high quality, despite the constrained environment in which their first-pass analysis has been performed. Since these goals are common to industry and science, cross-talk between academia and industry in terms of datasets and methods is an integral part of the PhD projects.

The work of the ESRs aligns with the following aspects, which map to Work Packages in the SMARTHEP project:

Machine Learning (WP3): In their first year, ESRs focused on ML, with dedicated training and significant progress. They developed ML algorithms for various applications in LHC experiments, including data compression, detector modeling, and track reconstruction. A network-wide effort from the ESRs produced a white paper on the state of the art of ML for real-time analysis at the LHC.

Hybrid Computing Architectures (WP4): ESRs in their first year have employed hybrid computing techniques, particularly in ML frameworks, but also to develop specific solutions for present and future trigger systems. A collaborative white paper on the usage and importance of hybrid architectures for real-time LHC analysis enhanced their understanding of these systems.

Real-time Decision Making (WP5): Most ESRs concentrated on real-time decision-making aspects, contributing to trigger algorithms and working on real-time streams. Industry collaborations, especially in finance, started in the first year. The ESRs also wrote a white paper summarising trigger schemes at the LHC.

ESRs actively participated in internal meetings, workshops, and international conferences, presenting their work, including at CHEP 2023, the biggest computing conference in High Energy Physics. One publication, with significant student input, is being prepared for submission to the Journal of Instrumentation.
The ESRs commenced their PhD projects coinciding with the restart of the LHC's Run-3 data taking. They significantly advanced trigger systems, enabling novel triggerless readouts and real-time data streams for all LHC experiments. A publication summarizing this work in the ATLAS experiment has been submitted to JINST, with further publications for other experiments planned.

ESRs developed ML algorithms for efficient tracking, reconstruction, calibration, and identification of particles, as well as data compression algorithms (released as cross-field open-source software). Industrial contributions include algorithms for computer vision in driving assistance and for financial time series analysis. Hybrid computing highlights include a CPU+GPU track reconstruction prototype for ATLAS, GPU-accelerated trigger pipelines in LHCb, and the porting of code to frameworks that support accelerators.

ESRs will continue enhancing LHC trigger systems and using these improvements for physics analysis. In the "RTA for monitoring and discoveries" Work Package, they will focus on implementing ML algorithms in Field Programmable Gate Arrays (FPGAs) during their secondments, targeting the 2029 data-taking period. Outlier detection algorithms for science and industry will be developed, with significant impact, publications, and dissemination anticipated in the next reporting period. A new line of research on energy-efficient ML algorithms is also showing promising results, with potentially broad scientific and societal impact.
SMARTHEP ESRs, PIs and External Advisory Board visit the European Spallation Source in Lund during t
SMARTHEP ESR Sofia Cella (official ATLAS guide) guides the ESRs and PIs underground in 2022
One of the Early Stage Researchers, Jamie Gooding, presenting the SMARTHEP network at CHEP 2023.
SMARTHEP kick-off in Manchester (UK) in 11/2022, with a visit to the Jodrell Bank Observatory