Cross-Layer Design of Adaptive, Reliable and Energy Efficient Systems

Final Report Summary - DARE (Cross-Layer Design of Adaptive, Reliable and Energy Efficient Systems)

As transistor sizes approach the atomic scale, circuits become more prone to permanent and transient faults, threatening the realization of more powerful and energy-efficient devices in the near future. Currently, manufacturers go to great lengths to guarantee fault-free operation of their products by adding voltage margins, conservative layout rules, and extra protection circuitry. However, such measures limit the performance that can be achieved in each process technology generation and, more importantly, increase power consumption, which is another major design challenge. DARE seeks to turn the tables and show that, by revealing and exploiting the inherent error resilience and scalability of various applications, the long-standing tradition of ensuring 100% error-free operation can be broken. Indeed, after four years of intensive research, DARE showed that in many communication, multimedia, data mining and health monitoring applications, a paradigm shift towards less than 100% accurate computation and storage is possible, with substantial improvements in energy efficiency. To achieve this, mechanisms were developed at various levels of design abstraction that allow the system to adapt gracefully to dynamically changing operating conditions and user requirements, while systematically capturing the interplay between energy, reliability/yield, and quality. Rather than trying to correct all errors in all computations and stored data, the proposed approach applies unequal error protection at various abstraction layers, giving priority to the most significant blocks and computations, which carry most of the relevant signal information. Less significant parts are allowed to fail (partially), to produce erroneous results, or to be skipped or approximated, as long as an acceptable quality of service is maintained.
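The unequal error protection idea described above can be illustrated with a minimal Python sketch. It is not the project's framework: the 8-bit word width, the independent bit-flip fault model, the 5% bit error rate, the synthetic samples, and the function names are all assumptions chosen for illustration. The sketch injects random bit flips into unsigned samples and compares the mean absolute error when no bits are protected with the error when the four most significant bits are assumed to be fault-free.

```python
import random

WIDTH = 8  # assumed word width in bits (illustrative, not from the report)


def inject_bit_flips(value, bit_error_rate, protected_msbs, rng):
    """Flip each unprotected low-order bit independently with the given probability."""
    unprotected_bits = WIDTH - protected_msbs
    for bit in range(unprotected_bits):
        if rng.random() < bit_error_rate:
            value ^= 1 << bit
    return value


def worst_case_error(protected_msbs):
    """Largest possible deviation when only the unprotected bits can flip."""
    return (1 << (WIDTH - protected_msbs)) - 1


def mean_abs_error(samples, bit_error_rate, protected_msbs, seed=0):
    """Average absolute deviation of the faulty samples from the originals."""
    rng = random.Random(seed)
    total = sum(
        abs(inject_bit_flips(s, bit_error_rate, protected_msbs, rng) - s)
        for s in samples
    )
    return total / len(samples)


# Synthetic 8-bit samples (hypothetical data, for illustration only).
samples = [(37 * i) % 256 for i in range(1000)]

unprotected = mean_abs_error(samples, 0.05, protected_msbs=0)
protected = mean_abs_error(samples, 0.05, protected_msbs=4)

print("mean error, no protection:   ", unprotected)
print("mean error, 4 MSBs protected:", protected)
print("worst case, 4 MSBs protected:", worst_case_error(4))
```

Under this toy fault model, protecting only the upper half of each word bounds the error of any single sample by 15, which is the sense in which the less significant parts can be left to fail while quality stays acceptable.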

During the four-year reintegration period, the above methodology was refined and successfully applied to many more essential wireless communication algorithms, as well as to health monitoring, multimedia and data mining applications. In particular, using the error injection framework developed in the first period of the project, we studied the behavior of different types of channel decoders under hardware-induced failures. In addition, we developed low-cost techniques for improving the error resilience of the studied communication blocks by taking advantage of their statistical characteristics. The novel Fast Fourier Transform that we introduced in the first period of the project was used within a quality-scalable spectral analysis system for cardiac signals, and an emerging, award-winning start-up company has explored integrating it into one of its health monitoring products. Furthermore, using our framework, we studied the behavior of such a health monitoring application under hardware errors and developed new low-cost fault-tolerant schemes for improving its energy efficiency.

In addition, we studied dynamic memories (DRAM), quantifying for the first time the related energy and reliability trade-offs using data from our own fabricated chips. In such memories, each bit cell must be refreshed periodically so that the stored information is not lost, and these refresh cycles cost both power and performance. Our studies show that the conventional refresh rates in embedded DRAMs can be relaxed significantly by exploiting the error resilience of various applications that our research has revealed. In fact, our latest experiments show that application resilience allows the refresh rate to be relaxed to the point where we not only reduce power by 2x but also increase memory availability to 95%. This finding should prove especially beneficial for the very large memory systems that store the growing volume of data in cloud data centers.
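To see why relaxing the refresh rate helps both power and availability, consider the following toy model in Python. It is a simplification made for illustration and not the model used in the project: it assumes that every refresh blocks the memory for a fixed time and costs a fixed energy. All numbers and function names below are hypothetical and are not the project's measurements.

```python
def refresh_availability(refresh_period_us, refresh_busy_us):
    """Fraction of time the memory is free to serve accesses (toy model)."""
    if refresh_busy_us > refresh_period_us:
        raise ValueError("refresh cannot take longer than the refresh period")
    return 1.0 - refresh_busy_us / refresh_period_us


def refresh_power_mw(refresh_period_us, energy_per_refresh_nj):
    """Average refresh power in milliwatts (1 nJ per microsecond equals 1 mW)."""
    return energy_per_refresh_nj / refresh_period_us


# Hypothetical parameters, chosen only to make the arithmetic easy to follow.
BUSY_US = 8.0       # time the memory is blocked by one refresh
ENERGY_NJ = 128.0   # energy spent on one refresh

for period_us in (64.0, 256.0):  # conventional vs. relaxed period (assumed values)
    print(
        period_us,
        refresh_availability(period_us, BUSY_US),
        refresh_power_mw(period_us, ENERGY_NJ),
    )
```

In this model, refresh power scales inversely with the refresh period, and the time lost to refresh shrinks in the same proportion. A longer period is only acceptable if the application tolerates the extra retention errors it causes.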

Finally, building upon our observations, we proposed generic error-confining methods that take advantage of the statistical properties of applications and the principles of numerical representations to reduce the overheads of traditional fault-tolerant schemes. We showed that by focusing on confining the impact of errors on the output results, rather than trying to detect and correct every single error, up to 83% of read power, 77% of read access time, and 89% of area can be saved for a variety of tested data mining and multimedia applications.
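One way to picture error confinement is the Python sketch below. It is an illustration built on assumptions, not the project's published scheme: a single parity bit flags a corrupted word, and instead of correcting the word, the read returns a statistically expected value, here simply the rounded mean of the data. The names `store`, `confined_read` and `expected_value`, and the sample readings, are hypothetical.

```python
def parity(word):
    """Return 1 if the word has an odd number of set bits, else 0."""
    return bin(word).count("1") & 1


def store(word):
    """Keep the word together with one parity bit (no correction code)."""
    return (word, parity(word))


def confined_read(stored, expected_value):
    """Return the word, or the expected value when the parity check fails.

    A single parity bit only detects an odd number of flipped bits; this is
    accepted here because the goal is to bound the error, not to remove it.
    """
    word, parity_bit = stored
    if parity(word) != parity_bit:
        return expected_value
    return word


# Hypothetical sensor-like readings clustered around 100.
readings = [98, 101, 97, 103, 100]
expected = round(sum(readings) / len(readings))

memory = [store(r) for r in readings]

# Simulate a hardware fault: flip the most significant bit of one stored word.
word, parity_bit = memory[2]
memory[2] = (word ^ 0x80, parity_bit)

raw = [w for w, _ in memory]
confined = [confined_read(m, expected) for m in memory]

print("raw read:     ", raw)
print("confined read:", confined)
```

In this example the corrupted word is not restored to its original value, but its deviation is confined to a few units instead of 128, at the cost of one parity bit per word rather than a full error-correcting code.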

All in all, the results obtained during the project validate the potential of the proposed approach for a paradigm shift in the design of future electronic products in the various embedded and high-performance system domains that are the focus of the strategic European research agenda. New, more energy-efficient systems can be developed by avoiding power-hungry redundancy-based techniques. By taking advantage of the error resilience of signal processing applications and wireless communications, yield can be maintained at very high levels, and production costs can thus be reduced, since manufacturers will have to discard fewer chips. In other words, the developed methods promise cheaper, more energy-efficient and more reliable systems for European citizens and contribute to the European research objectives described in Horizon 2020.

The work conducted during the reintegration period has appeared in top-tier conferences and journals. The reintegration period allowed the Fellow to establish new, productive collaborations with well-known European institutes, to present his work at conferences around the world, and to serve on the technical program committees of many of them. The results and the future potential of the proposed ideas have attracted the interest of many engineering schools in Europe as well as in the U.S.A., and led to the employment of the Fellow as a tenure-track professor at Queen’s University of Belfast, one of the leading research institutes in the United Kingdom. The Fellow has already established his own independent lab, consisting of 2 postdocs and 2 PhD students, while attracting undergraduate students to write their final-year Bachelor theses on topics related to the project. The project website (https://sites.google.com/site/daresystemeu/) provides updated information about the project, while a Facebook page and a YouTube channel are used to communicate the main project ideas, results and potential impact to the general public.

Overall, the reintegration period has enhanced the Fellow’s expertise, academic profile, and track record, allowing him to raise substantial funding to develop and support his independent lab, through which he can continue to influence society and the research community.