Final Report Summary - DPMP (Dependable Performance on Many-Thread Processors)
This project developed a cycle accounting architecture to track per-thread performance, which enables system software to deliver dependable performance by assigning hardware resources to threads depending on their relative progress. Through this cooperative hardware-software approach, this project addressed a fundamental problem in multi-threaded and multi/many-core processing.
More specifically, we made several important contributions through this project. (1) We designed novel cycle accounting architectures, called criticality stacks and bottle graphs, to monitor per-thread performance in multi-threaded and managed language workloads. (2) We leveraged per-thread progress to steer hardware/software cooperative scheduling and resource management to optimize (heterogeneous) multicore performance under bandwidth, power and reliability constraints. (3) To evaluate these ideas, we developed Sniper, a parallel, hardware-validated, multi/many-core simulator that runs at a simulation speed of up to 2 MIPS on current hardware. Its key feature is the ability to model core performance at a high level of abstraction using analytical models, which reduces both simulator development and evaluation time. The overarching (meta) conclusion from the project is that simple white-box analytical models are extremely powerful for comprehensively monitoring workload execution characteristics, steering application scheduling and resource management, and building fast simulation infrastructures for increasingly complex multicore processor architectures.
This project has led to more than 60 publications in high-profile journals and conferences; a publicly released architecture simulator called Sniper that is now widely used in academia and industry (http://www.snipersim.org/); a spin-off called CoScale in datacenter monitoring (http://www.coscale.com/); and two related ERC Proof-of-Concept projects.