CORDIS - EU research results

Sustainable Performance for High-Performance Embedded Computing Systems

Periodic Reporting for period 4 - SuPerCom (Sustainable Performance for High-Performance Embedded Computing Systems)

Reporting period: 2022-12-01 to 2023-11-30

SuPerCom addresses the challenge of providing "high and sustainable performance" (hsperf) covering the highest-ever computation performance needs of critical software with strong guarantees on sustainability for safe operation. To reach its goals, SuPerCom proposes a radical new approach by combining performance analysis, hardware design, and statistical and machine learning analysis. SuPerCom proposes innovative solutions that push the limits of current approaches for sustainable performance.

SuPerCom solutions can become an integral part of the ecosystem of next-generation embedded computing systems by allowing them to use increasingly complex high-performance hardware features on which strong guarantees of sustainable performance can be placed soundly a priori. This will allow developers to deploy computationally demanding functionality with sound guarantees, such as complex algorithms to control CO2 emissions or to provide driving assistance in cars, or medical devices with increasing performance requirements such as pacemakers and infusion pumps. Hence, the SuPerCom breakthrough can have a significant economic and societal impact.

1. SuPerCom provides sustainable performance with minimum impact on the performance of complex resources (e.g. accelerators) and with a small impact on overall hardware complexity.
2. For hard-to-predict resources, SuPerCom shifts away from performance-capping solutions and instead enables high-performance features by adding novel hardware sensing techniques that implement Key Performance Indicators (KPIs).
3. SuPerCom leverages statistical analysis to manage the data coming from the proposed advanced KPIs and the hardware sensors that make them visible.
4. To enable incremental software verification, SuPerCom characterizes performance requirements for individual applications in isolation and develops an automatic framework that produces benchmarks to create the controlled load scenarios needed for application profiling.
5. SuPerCom introduces a hsperf in-field feedback-loop mechanism that maintains the sensing active during system operation to collect measurements for each system instance.

Requirements. We defined the main project requirements during the first months of the project: hardware support for predictability and observability. We also defined the main statistical and machine-learning techniques to be used in the project.

Case Studies/Benchmarks. We proposed a benchmarking approach for state-of-the-art autonomous driving (AD) platforms that follows the structural design and functions of AD systems. In addition, we ported a space case study to an embedded GPU, showing that accelerating existing space algorithms on GPUs is feasible and effective.

Toolchain. We developed a baseline simulation infrastructure featuring state-of-the-art architectural support and industry-level accuracy.

Modelling. We developed timing models for crossbar interconnects that result in tighter bounds. We also presented better modeling approaches for the different parameters of a network-on-chip interconnect. For buses, we proposed an ILP formulation for computing the worst-case contention delay suffered by a task due to interference.
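To give a feel for what a worst-case contention bound on a bus looks like, here is a deliberately simplified toy model. It is not the project's ILP formulation. It assumes a round-robin bus on which each request of the task under analysis can be delayed by at most one request from each contender core, and on which each contender request has a known latency in cycles. Under those assumptions the maximization problem decomposes per core and can be solved greedily by charging the longest contender requests first. The function name `worst_case_contention` and the input layout are hypothetical.

```python
def worst_case_contention(n_task_requests, contenders):
    """Upper-bound the contention delay (in cycles) suffered by one task.

    Toy model, assuming round-robin arbitration: every request of the
    analysed task can wait for at most one request from each contender core.

    n_task_requests: number of bus requests issued by the analysed task.
    contenders: one dict per contender core, mapping the latency of a
                request type (cycles) to how many such requests the core issues.
    """
    total_delay = 0
    for core in contenders:
        # At most one interfering request per task request for this core.
        slots = n_task_requests
        # Charge the longest contender requests first to maximise the delay.
        for latency in sorted(core, reverse=True):
            used = min(slots, core[latency])
            total_delay += used * latency
            slots -= used
            if slots == 0:
                break
    return total_delay


# Example: the task issues 10 requests and competes with two other cores.
example_delay = worst_case_contention(10, [{5: 4, 20: 3}, {8: 100}])
print(example_delay)
```

In this example the first core can interfere with only 7 of the 10 task requests (3 x 20 + 4 x 5 cycles), while the second core interferes with all 10 (10 x 8 cycles). A real ILP formulation would typically add constraints that couple cores and request types, which is where a solver becomes necessary.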

Analysis. We developed a technique to handle the variability in the values of hardware event monitors when the same experiment is run several times. For probabilistic WCET analysis, we showed how survivability-analysis theory can help in producing tighter bounds. We also proposed a novel technique based on Markov’s Inequality for probabilistic WCET analysis that shifts away from previous approaches based on Extreme Value Theory (EVT). We identified the main gaps in the analysis of AD applications that hinder their adoption in critical systems, including an analysis of the main sources of non-determinism. We showed how statistical analysis can be used to model the timing behavior of AD software, and how AD applications can be adapted to fully exploit the performance of the different computing elements in advanced hardware. We produced the first survey on the use of probabilistic worst-case analysis in the literature, which also covers multicore processors and GPUs. The statistical analysis used in the project allowed us to model other metrics of interest such as worst-case energy consumption, power peaks, and hardware faults. Moreover, we developed a methodology, used together with software randomization (a probabilistic WCET enabler), that computes resource allocations in terms of memory and timing budget.
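As a rough illustration of the idea behind a Markov-based bound (a textbook sketch, not the project's actual technique): for a non-negative random variable X and t > 0, Markov's inequality gives P(X >= t) <= E[X]/t, and applying it to X^k gives P(X >= t) <= E[X^k]/t^k. Solving for t at a target exceedance probability p yields the candidate bound t = (E[X^k]/p)^(1/k), and the smallest value over several k can be kept. The function name `markov_pwcet_bound`, the `max_order` parameter, and the use of sample moments in place of the true moments are assumptions made for illustration. Because sample moments are only estimates, the result is not a sound guarantee on its own.

```python
from statistics import fmean


def markov_pwcet_bound(samples, exceedance_prob, max_order=8):
    """Execution-time bound from Markov's inequality applied to X**k.

    samples: measured, non-negative execution times (e.g. in cycles).
    exceedance_prob: target probability p of exceeding the returned bound.
    max_order: highest moment order k tried (illustrative choice).
    """
    if not samples or min(samples) < 0:
        raise ValueError("samples must be non-empty and non-negative")
    if not 0 < exceedance_prob <= 1:
        raise ValueError("exceedance_prob must be in (0, 1]")

    best = float("inf")
    for k in range(1, max_order + 1):
        # Sample estimate of E[X**k]; an assumption, not the true moment.
        moment = fmean([x ** k for x in samples])
        # From P(X >= t) <= E[X**k] / t**k, solved for t at probability p.
        bound = (moment / exceedance_prob) ** (1.0 / k)
        best = min(best, bound)
    return best


measurements = [100, 102, 98, 105, 101]
print(markov_pwcet_bound(measurements, 1e-3))
```

Higher-order moments usually tighten the bound considerably compared with the plain first-order inequality, which is why the sketch keeps the minimum over k.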

Characterization and Observability. We showed the main challenges in characterizing complex AD applications to derive metrics such as execution time and memory usage. We also showed how micro-benchmarks can be used to derive bounds for space applications on boards representative of that domain. At the hardware level, we investigated some of the uncertainty in readings from hardware event monitors, which can be subject to unexpected behavior, and proposed two methodologies to increase confidence in their correct behavior.

Hardware Support. We showed GPU configurations that are appropriate for automotive setups. We proposed several hardware techniques that track contention delay rather than events, as a way to improve the accuracy of contention-cycle estimates. This work covered main memory, cache coherence, and GPUs. We proposed a cache write policy that reconciles the benefits of high-performance and real-time policies. We also proposed a performance monitoring unit for safety-critical systems.

The results listed above were published in peer-reviewed journals and conferences in the area of real-time systems, including the IEEE Real-Time Systems Symposium (RTSS) and the Euromicro Conference on Real-Time Systems (ECRTS). These works have also been disseminated in workshops and specific events organized by experts in the area. Some ideas are now being pursued in other projects at higher technology readiness levels (TRL). For the software technologies, we are in contact with companies in the area (e.g. avionics), as they have shown interest in the results obtained.

The techniques proposed for modeling, statistical/big-data analysis, hardware, and software are novel. They include the use of risk analysis instead of survivability analysis, pattern-matching techniques for the modeling of parallel interconnects, and the first use of Markov’s Inequality for pWCET estimation.

We also carried out studies on how to use statistical analysis for AD software and on the main sources of uncertainty in AD software. Further studies cover how AD applications can better exploit different compute elements to reduce WCET, how request sequences can be used to exploit interconnect parallelism, and how to exploit information about GPU memory allocation obtained by reverse engineering these black-box components.

Further contributions are the analysis of hardware event monitor predictability and the statistical approach to handle it; the hardware techniques that improve predictability, or its modelling, without affecting average performance; the formalization of the requirements, and proposals, for observing contention on complex processors as an alternative to only modeling it; and the resource allocation approach for software-randomized systems with probabilistic WCET. Last but not least, we used neural networks to make more accurate predictions of contention delay.