HARDWARE AND SOFTWARE TECHNIQUES TO IMPROVE MEMORY SYSTEM PERFORMANCE AND POWER ON MULTICORE ARCHITECTURES

Periodic Report Summary - SMART MEMORY SYSTEM (Hardware and software techniques to improve memory system performance and power on multicore architectures)

The SMART MEMORY SYSTEM project intends to leverage Dr Ibrahim Hur's expertise in computer architecture and compiler design to develop novel hardware and software techniques that improve the performance and power consumption of memory systems in multicore architectures. In the past few decades, advances in silicon process technology have significantly reduced the size and switching times of transistors. As a result, both the number of transistors on a single die and the clock rates of processors have increased rapidly, enabling processor performance to double almost every two years. However, these performance improvements have not resulted in comparable speedups for all applications. For instance, increasing the performance of a single processor of an IBM Power5+ system by 50 % improves the performance of the industry-standard SPEC CPU2006 benchmarks by only 13.1 %. Overall performance does not scale at comparable rates in all applications because memory system performance has not kept pace with processor performance in modern systems. Memory system performance has two aspects: latency and bandwidth. Long latency and insufficient bandwidth both limit the performance of modern systems, and the power consumption of the memory system has recently received considerable attention as well.
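The gap between the 50 % processor improvement and the 13.1 % benchmark improvement can be put in rough perspective with Amdahl's law. The sketch below is an illustration only, not part of the project's methodology: it assumes that the 50 % figure means the core runs 1.5 times faster, that the 13.1 % figure means an overall speedup of 1.131, and that run time splits cleanly into a part that scales with the core and a part (for example, waiting on memory) that does not. The function names are hypothetical.

```python
def scalable_fraction(core_speedup: float, overall_speedup: float) -> float:
    """Solve Amdahl's law for the fraction f of run time that scales with the core.

    Amdahl's law: overall_speedup = 1 / ((1 - f) + f / core_speedup)
    """
    return (1.0 - 1.0 / overall_speedup) / (1.0 - 1.0 / core_speedup)


def amdahl_speedup(core_speedup: float, fraction: float) -> float:
    """Overall speedup when only `fraction` of run time benefits from the faster core."""
    return 1.0 / ((1.0 - fraction) + fraction / core_speedup)


# Assumed reading of the figures in the text: core 1.5x faster, benchmarks 1.131x faster.
f = scalable_fraction(core_speedup=1.5, overall_speedup=1.131)
print(f"fraction of run time that scales with the core: {f:.3f}")
print(f"check, speedup recovered from f: {amdahl_speedup(1.5, f):.3f}")
```

Under these assumptions only about 35 % of the run time benefits from the faster processor; the remainder is insensitive to core speed, which is consistent with the argument that the memory system is the limiting factor.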

As technological advances rapidly increased the number of transistors on a single die, designers for many years used this ample chip area for increasingly complex core architectures and large on-chip caches. Unfortunately, both approaches yield diminishing returns. Therefore, in the last eight years chips with multiple cores have started to emerge. Today, most microprocessors have two or more cores on a chip, and this number is expected to increase in the future. Although multicore architectures seem promising for better performance, they have two major limitations. First, because some cores share the same channels to access memory, the ability to fully utilise all cores diminishes as the number of cores grows, owing to increased memory demands. Second, multicore systems are not easily programmable and therefore rely heavily on compiler support. This second issue is similar to the programming and compilation problem of parallel systems, but it is now even more complicated because of more sophisticated cores and memory hierarchies.
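The first limitation can be illustrated with a deliberately simple model. The sketch below assumes that every core is fully memory-bound, that all cores share one channel, and that each core would consume a fixed bandwidth if unconstrained. The function name and the bandwidth figures are placeholders chosen for illustration, not measurements from the project.

```python
def core_utilisation(n_cores: int,
                     demand_per_core_gbs: float,
                     channel_bandwidth_gbs: float) -> float:
    """Fraction of each core's potential throughput that a shared channel can sustain.

    Assumes fully memory-bound cores that split the channel bandwidth evenly.
    """
    total_demand = n_cores * demand_per_core_gbs
    if total_demand <= channel_bandwidth_gbs:
        return 1.0  # the channel is not saturated
    return channel_bandwidth_gbs / total_demand


# Placeholder figures: 4 GB/s demand per core, 16 GB/s shared channel.
for n in (1, 2, 4, 8, 16):
    u = core_utilisation(n, demand_per_core_gbs=4.0, channel_bandwidth_gbs=16.0)
    print(f"{n:2d} cores: utilisation {u:.2f}, effective cores {n * u:.1f}")
```

With these placeholder numbers the channel saturates at four cores; at eight cores each core is used at 0.5 of its potential, so the added cores contribute no extra throughput in this model.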

To address these issues, Dr Hur will primarily carry out research to develop low-cost and practical hardware techniques that enhance memory controllers to improve memory bandwidth, latency and power consumption, together with software techniques to improve locality. The project specifically intends to meet the following scientific and technical goals:

- creating a cycle-accurate simulation environment to design and evaluate hardware and software techniques for multicore systems;
- developing innovative memory prefetching, memory scheduling, and memory power management techniques for multicore architectures, and developing compiler transformations to improve locality in multicore systems;
- integrating and evaluating hardware and software methods in the context of the multicore Cell Broadband Engine architecture.

The project achieved the following two main results in its first seven months:

1. models for the Power Processor Element (PPE) and a single Synergistic Processing Element (SPE) were implemented in the simulator;
2. functional and performance verification of both the PPE and a single SPE was performed, and the performance errors were found to be within acceptable ranges.
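The report does not state how performance error was measured. One common convention, assumed here purely for illustration, is the relative difference between the cycle count predicted by the simulator and the cycle count measured on reference hardware. The kernel names and cycle counts below are invented placeholders, not project results.

```python
def relative_error(simulated_cycles: float, reference_cycles: float) -> float:
    """Relative performance error of a simulator against a reference measurement."""
    if reference_cycles <= 0:
        raise ValueError("reference_cycles must be positive")
    return abs(simulated_cycles - reference_cycles) / reference_cycles


# Invented placeholder data: (simulated cycles, reference cycles) per kernel.
runs = {
    "kernel_a": (1050, 1000),
    "kernel_b": (980, 1000),
}

for name, (simulated, reference) in runs.items():
    print(f"{name}: relative error {relative_error(simulated, reference):.1%}")
```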