Project description
Computing Systems
Making modern embedded systems faster and less power hungry by parallelization
Writing parallel programs has traditionally been considered a difficult task, even when parallelism is taken into account from the beginning. Moreover, there is an urgent need to parallelize the massive amounts of legacy sequential code so as to increase its performance on processors and systems whose focus has shifted from single-thread acceleration to overall throughput. At the same time, memory (in particular cache) performance is essential to achieve the full gain from a parallelized application. While processor architecture tends to be relatively standard across applications within a domain, huge performance and power improvements can be achieved by tailoring the cache architecture to the application at hand, and not just to an entire domain.
The HEAP project faces these challenges directly, by developing:
1. An innovative toolset that helps software developers profile and parallelize existing sequential implementations by exploiting top-level pipeline-style parallelism.
2. A highly configurable cache architecture that can be tailored to an application by using the same profiling data as those that were used for parallelization, in order to fully exploit the available computing power.
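To make the notion of top-level pipeline-style parallelism concrete, the following is a minimal sketch, not the HEAP toolset itself. It splits a sequential computation into stages that run concurrently and exchange data through bounded FIFO queues. It is written in Python for brevity, although the project targets sequential C code. All names (`run_pipeline`, `_stage`, `capacity`) are hypothetical, and in CPython the threads illustrate the structure rather than a real speedup.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the data stream


def _stage(func, inbox, outbox):
    """Apply func to every item read from inbox and forward the result."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # propagate shutdown to the next stage
            return
        outbox.put(func(item))


def run_pipeline(stages, items, capacity=4):
    """Run each stage function in its own thread, linked by bounded queues.

    Hypothetical helper: no error handling, one thread per stage,
    results are returned in input order.
    """
    queues = [queue.Queue(maxsize=capacity) for _ in range(len(stages) + 1)]

    def feed():
        # Producer: pushes the input stream into the first queue.
        for item in items:
            queues[0].put(item)
        queues[0].put(_DONE)

    threads = [threading.Thread(target=feed)]
    for i, func in enumerate(stages):
        threads.append(
            threading.Thread(target=_stage, args=(func, queues[i], queues[i + 1]))
        )
    for t in threads:
        t.start()

    # Consumer: drain the last queue while the stages are still running,
    # so the bounded queues never fill up permanently.
    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)

    for t in threads:
        t.join()
    return results
```

For example, `run_pipeline([lambda x: x + 1, lambda x: x * 2], range(5))` processes item 1 in the first stage while item 0 is already in the second stage. The bounded queues keep a fast stage from running arbitrarily far ahead of a slow one.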
Compared with existing single-cache coherency architectures and existing, mainly manual, parallelization approaches, the end product of HEAP (i.e. the novel architecture combined with the innovative toolset) is expected to: (a) reduce the time needed to parallelize sequential applications by 20%; (b) reduce the energy consumed by memory coherency operations by 20%; and (c) increase the performance of memory coherency systems by 20%.
The HEAP framework directly addresses two distinct multi-billion application areas: (a) High Performance Computing and (b) Multi-core Embedded Systems. In both fields the impact of HEAP is expected to be significant worldwide; this claim is supported by the fact that the HEAP results will be exploited internally by two of the largest semiconductor companies in the world (STM and Thales), as well as by a large-scale Information Systems Provider (Singular Logic) and an SME (Synelixis). In addition, the commercial version of the toolset will be exploited by two further software tool providers (ACE and Compaan Design). HEAP-based multi-core systems are also expected to help close the digital gap in Europe, while the open-source version of the toolset in particular will reinforce European competitiveness in the area of parallelizing toolsets, and the new platforms will help extend existing service offerings to EU citizens.
In particular, the HEAP project will provide:
1. A novel SMP multicore platform supporting a group of novel cache coherence protocols; each application will be profiled so as to select and tune the most appropriate cache coherency mechanism.
2. An innovative toolflow that complements this architecture; this tool will ease and/or automate the parallelisation of sequential C code based on an analysis of the dataflow, while it will provide configuration and tuning data (e.g. in terms of which variables are local, and which are mostly written or mostly read by a thread) to the cache coherency mechanisms so as to optimize them for the given application.
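The kind of tuning data mentioned above (which variables are local, mostly read, or mostly written) could be derived from a memory-access profile roughly as follows. This is an illustrative sketch under stated assumptions, not the HEAP toolflow: the trace format, the function name `classify_variables`, the category labels, and the 0.8 threshold are all hypothetical. The read/write ratio is also aggregated over all threads, which is only one possible reading of the description.

```python
from collections import defaultdict


def classify_variables(trace, threshold=0.8):
    """Classify variables from a trace of (thread_id, variable, op) tuples.

    op is "r" for a read or "w" for a write. Hypothetical categories:
      "local"          - accessed by a single thread only
      "mostly_read"    - shared, and at least `threshold` of accesses are reads
      "mostly_written" - shared, and at least `threshold` of accesses are writes
      "mixed"          - shared, with no dominant access type
    """
    accessing_threads = defaultdict(set)
    reads = defaultdict(int)
    writes = defaultdict(int)

    for thread_id, var, op in trace:
        if op == "r":
            reads[var] += 1
        elif op == "w":
            writes[var] += 1
        else:
            raise ValueError(f"unknown access type: {op!r}")
        accessing_threads[var].add(thread_id)

    classes = {}
    for var, tids in accessing_threads.items():
        total = reads[var] + writes[var]  # always >= 1 for a recorded variable
        if len(tids) == 1:
            classes[var] = "local"
        elif reads[var] / total >= threshold:
            classes[var] = "mostly_read"
        elif writes[var] / total >= threshold:
            classes[var] = "mostly_written"
        else:
            classes[var] = "mixed"
    return classes
```

A configurable coherence mechanism could then, for instance, skip coherence traffic for "local" variables and treat "mostly_read" data differently from "mostly_written" data. How HEAP actually maps such classes to protocols is not specified here.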
To increase the exploitability of the end results, the toolflow (a version of which will also be distributed as open source) will be implemented in such a way that it can perform sequential-to-multicore migration for any multicore architecture, not only the HEAP one. Likewise, the architecture will be capable of running multithreaded code compiled by any compiler or toolset, not only the one implemented by HEAP. However, to take full advantage of the HEAP results, the toolset and the architecture should be used together.
We innovate in the first domain by using both pessimistic and optimistic estimates of the available parallelism, by refining those estimates using metric-driven verification techniques, and by supporting dynamic recovery of excessively optimistic parallelization.
Field of science
Topic(s)
Call for proposal
FP7-ICT-2009-4
Funding scheme
CP - Collaborative project (generic)
Coordinator
20864 Agrate Brianza
Italy