Periodic Reporting for period 2 - NeuRAM3 (NEUral computing aRchitectures in Advanced Monolithic 3D-VLSI nano-technologies)
Reporting period: 2017-07-01 to 2019-06-30
The overall aim is to provide a European solution for deploying Artificial Intelligence (AI) techniques much closer to where the analysed data originates and to where the necessary actions need to be executed. Today most AI is concentrated in large data centers (the "Cloud"), where large amounts of data can be stored and powerful High Performance Computing (HPC) tools can operate on them. However, this approach has several shortcomings for a number of applications. The first is the energy consumption needed to transfer and store all the data. The second is its centralised nature, which poses problems of security and confidentiality in the use of the data. The third is that transmission times are not compatible with the need for real-time action in some applications. To overcome these issues, we propose to use smaller systems, closer to the point of data generation or even embedded with the sensors, that are capable of extracting some information from the data locally and in real time, and so streamline the analysis and decision process. Moreover, we investigated the possibility of integrating some degree of learning in these systems, so that they can evolve and adapt to the specific environment in which they are deployed.
To achieve this goal, an interdisciplinary group of partners formed the consortium, combining specialists in the theory of computing, neuromorphic architecture design, nano-technology, and 3D VLSI integration. The idea was to co-develop novel devices, advanced fabrication technologies, circuit design, computing architectures and dedicated algorithms in order to achieve the best possible compromise between performance and power. In particular, the objective is to have a scalable solution that can both interface with low-level sensor signals and be interconnected in multichip systems to achieve complex data analysis.
By the end of the project, the consortium had realised a number of chips which demonstrate that Spiking Neural Networks can be used successfully in real-life applications (such as real-time continuous ECG monitoring); that compact, easily deployable algorithms such as variants of Reservoir Computing are sufficient in a number of cases; that these networks can benefit from new technologies such as FDSOI and RRAM; that they can be scaled up through segmented buses using TFTs; that RRAM is useful both for conventional digital Neural Networks and for new mixed analog/digital architectures; and that a denser 3D technology including RRAM is manufacturable for the future needs of circuit implementations.
1) an ultra-low-power, scalable and highly configurable neural architecture;
2) delivering a 50x gain in power consumption on selected applications compared to conventional digital solutions;
3) fabricated in Fully-Depleted Silicon on Insulator (FDSOI) at 28 nm design rules, and in parallel
4) validating the modules to realise RRAM synapses, both planar and in 3D monolithic structures;
5) and a TFT-based scalable segmented bus system to interconnect multiple chips.
To achieve a significant in-silicon validation of the concepts and to continue the study of specific algorithms for RRAM co-integrated with CMOS, CEA-LETI included designs from the other partners in one of its 130 nm CMOS+RRAM test reticles, in order to validate different concepts, albeit at a smaller scale and a lower TRL. Concerning the 3D technology, CEA-LETI designed and included in one of its standard runs some extra processing and dedicated e-beam patterning to prove the full integration scheme and the electrical functionality of the 3D plus RRAM structures.
Multiple chips were then designed and fabricated. The most complex is the Dynap-SEL chip, by UZH, a mixed analog/digital CMOS chip in 28 nm FDSOI technology from ST that is also capable of on-chip learning. This chip has demonstrated the capability of running complex networks with a world-record power efficiency of <2 pJ per synaptic operation, better than chips built in more advanced technologies by industrial players such as Intel. An older version of this chip was used to test all new algorithmic concepts, and a specific case of Reservoir Computing was then implemented on Dynap-SEL. Another chip by UZH, ReASOn, in 130 nm CMOS+RRAM, validated the possibility of on-chip learning. The chip from CSIC validated the concept of a CMOL crossbar and allowed an in-depth study of 1T1R arrays. The SPIRIT chip from CEA demonstrated that a full CMOS+RRAM dedicated circuit with 10 neurons and 16k synapses can perform handwritten character recognition.
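To put the reported efficiency of <2 pJ per synaptic operation in perspective, the following minimal sketch converts an energy-per-operation figure into an average power budget. Only the 2 pJ upper bound comes from the report; the function name and the workload of one billion synaptic operations per second are hypothetical values chosen purely for illustration, not measurements from the project.

```python
def synaptic_power_watts(energy_per_op_joules: float, ops_per_second: float) -> float:
    """Average power = energy per synaptic operation x operations per second."""
    return energy_per_op_joules * ops_per_second


# Upper bound reported for Dynap-SEL (<2 pJ per synaptic operation).
ENERGY_PER_SYNOP_J = 2e-12

# ASSUMPTION: hypothetical workload, not a figure from the project.
ASSUMED_SYNOPS_PER_S = 1e9

power_w = synaptic_power_watts(ENERGY_PER_SYNOP_J, ASSUMED_SYNOPS_PER_S)
print(f"Upper bound on synaptic power: {power_w * 1e3:.3f} mW")
```

Under these assumed numbers the synaptic activity would stay at or below roughly 2 mW; the actual power of a deployed system depends on the real event rate and on all non-synaptic circuitry, which this sketch ignores.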
The spiking neural network concept pioneered in the project is now also supported by larger players such as Intel, which have not yet achieved a comparable energy efficiency.
For ST, the project has allowed the verification of the potential of RRAMs with FDSOI and will contribute to the further development of their offering on Phase Change Embedded Memories.
For CEA and Imec, the project allowed them to expand their offering to their industrial partners, to add to their process integration catalogue, and to mature the level of demonstration for AI applications.
For UZH and CSIC, it has allowed them to remain at the forefront of research, and to design circuits and have them fabricated in technologies that are not commercially available and that will be exploited in future research actions.
For CNR and IBM, the project allowed the integration of materials they had previously developed into complex structures, enabling their in-depth study and a better understanding of the requirements for future developments.
For JacobsUni, the project allowed the transition from pure mathematical models and algorithms to their implementation in real hardware, and the inclusion of some new effects in the theory to improve the overall efficiency.