Heterogeneous Chip Multiprocessor Design

Final Report Summary - HTCMP (Heterogeneous chip multiprocessor design)

The HTCMP project aims to design efficient and powerful heterogeneous chip multiprocessors (HTCMPs). A challenging problem in such systems is the placement of processor cores and storage blocks within the available chip area. Focussing on heterogeneous chip multiprocessors, we address the following design decision problems:

1. effective distribution of the available area among the processor cores and the memory blocks (cache)
2. memory hierarchy design
3. selection of number of processors and their types from the processor pool
4. thread and data distribution
5. advanced techniques such as three-dimensional (3D) designs.

Our main objective is to make significant contributions towards the development of compiler-based techniques for emerging and future HTCMPs. Software support for such systems lags well behind current advancements at the circuit and architecture levels. Effective compilation support will make these architectures much easier to program, thereby helping scientists to port their applications to them. The outcomes of this research will also benefit the computer architecture field, because they will reveal the types of processor cores and memory components that the compiler needs in order to achieve the best application adaptation under dynamically changing power, performance and thermal conditions.

After the initial setup, we profiled the benchmarks and estimated their memory and processing requirements. Based on these requirements, we implemented two major components of the project. In both components, after parallelisation and mapping, the input code is fed to a compiler analysis module. The purpose of this module is to identify the set of chip multiprocessor (CMP) nodes that communicate with each other. This information is subsequently passed to a solver, which determines the location of each node within the network-on-chip (NoC) based CMP and the type of processor used for each node. The specific solvers we used in implementing this approach are:

1. a genetic algorithm (GA) based solver implemented in Java, and
2. an integer linear programming (ILP) based solver implemented with a commercial tool.
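
The solver step described above can be illustrated with a minimal sketch of a GA-based placement search. It is written in Python for brevity (the project's GA solver was implemented in Java) and is not the project's actual code. All names (comm_cost, order_crossover, ga_place), the 2D mesh with row-major slots, the traffic table, and the cost function (traffic volume times Manhattan hop distance) are assumptions made for illustration. Processor type selection is left out.

```python
import random

def comm_cost(placement, traffic, width):
    """Sum of traffic volume times Manhattan hop distance on a 2D mesh.

    placement[i] is the mesh slot (row-major index) assigned to node i.
    """
    total = 0
    for (a, b), volume in traffic.items():
        ra, ca = divmod(placement[a], width)
        rb, cb = divmod(placement[b], width)
        total += volume * (abs(ra - rb) + abs(ca - cb))
    return total

def order_crossover(p1, p2, rng):
    """Copy a slice from p1 and fill the rest in p2's order (stays a permutation)."""
    n = len(p1)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j] = p1[i:j]
    kept = set(child[i:j])
    fill = [slot for slot in p2 if slot not in kept]
    for k in range(n):
        if child[k] is None:
            child[k] = fill.pop(0)
    return child

def ga_place(traffic, n_nodes, width, pop_size=30, generations=100, seed=0):
    """Search for a low-cost node-to-slot placement with a simple GA."""
    rng = random.Random(seed)

    def fitness(placement):
        return comm_cost(placement, traffic, width)

    population = [rng.sample(range(n_nodes), n_nodes) for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):
        next_pop = [best[:]]  # elitism: carry over the best placement so far
        while len(next_pop) < pop_size:
            # binary tournament selection for each parent
            p1 = min(rng.sample(population, 2), key=fitness)
            p2 = min(rng.sample(population, 2), key=fitness)
            child = order_crossover(p1, p2, rng)
            if rng.random() < 0.3:  # swap mutation
                a, b = rng.sample(range(n_nodes), 2)
                child[a], child[b] = child[b], child[a]
            next_pop.append(child)
        population = next_pop
        best = min(population, key=fitness)
    return best, fitness(best)

if __name__ == "__main__":
    # Hypothetical communication volumes between four nodes on a 2x2 mesh.
    traffic = {(0, 1): 10, (1, 2): 10, (2, 3): 10, (0, 3): 1}
    print(comm_cost([0, 1, 2, 3], traffic, width=2))  # row-major placement costs 42
    print(ga_place(traffic, n_nodes=4, width=2))
```

The permutation encoding keeps every candidate a valid one-node-per-slot placement, which is why order crossover and swap mutation are used here instead of bitwise operators. An ILP formulation would encode the same assignment with binary variables and is not sketched here.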

During the second report period, we built on the work done during the first report period. The two major contributions during the first report period were:

1. distribution of the available area among the processor cores and the memory blocks, and
2. processor selection.

Our contributions in the second period were:

1. thread and data distribution,
2. communication reduction, and
3. advanced optimisations.

In the first part, we introduce an application-specific heterogeneous network-on-chip (NoC) design algorithm that takes the given constraints into account and generates a floorplan for the target many-core system. The second part aims at minimising the communication costs of 3D NoC architectures. The third part proposes a reliability-aware 3D NoC design that reduces inter-layer communication.
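
To make the inter-layer communication objective concrete, the following minimal sketch compares two layer assignments of the same nodes. The metric (traffic volume times the number of layers crossed), the function name inter_layer_traffic, and the example data are assumptions for illustration; the report does not state the exact cost model used in the project.

```python
def inter_layer_traffic(layer_of, traffic):
    """Traffic volume weighted by the number of layers each flow crosses.

    layer_of[i] is the layer index of node i; traffic maps (a, b) to a volume.
    """
    return sum(volume * abs(layer_of[a] - layer_of[b])
               for (a, b), volume in traffic.items())

# Hypothetical traffic: nodes 0-1 and 2-3 communicate heavily, 1-2 lightly.
traffic = {(0, 1): 8, (2, 3): 8, (1, 2): 1}

split_pairs = [0, 1, 0, 1]   # heavy communicators placed on different layers
keep_pairs = [0, 0, 1, 1]    # heavy communicators share a layer

print(inter_layer_traffic(split_pairs, traffic))  # 17
print(inter_layer_traffic(keep_pairs, traffic))   # 1
```

Under this assumed metric, keeping heavily communicating nodes on the same layer reduces the load on the vertical links between layers, which is the quantity the third part seeks to reduce.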

The project resulted in three direct journal publications, one conference publication, one poster and seven indirect journal publications. Moreover, three M.S. students and one PhD student were supported.