
Machine Learning-Based Test Solutions for Reliable Mixed-Signal/RF Integrated Devices

Final Report Summary - MLTSRMSRFID (Machine Learning-Based Test Solutions for Reliable Mixed-Signal/RF Integrated Devices)

Project context and objectives

Advances in digital computing have been a key factor in the increased pervasiveness of analogue integrated circuits (ICs) in modern systems. Analogue circuits play the role of sensors and actuators that interface real-world continuous signals with digital processors. In the automotive and avionics sectors, for example, analogue circuits are used for computerised engine control, safety and navigation aids. Another vast field of application is telecommunications, where analogue circuits constitute the essential part of the wireless transceiver.

The development of analogue circuits is nowadays driven by the demand for better performance in existing and emerging applications. Furthermore, the trend is towards integrating analogue circuits together with their digital counterparts onto the same silicon substrate, thus achieving high levels of miniaturisation, low power dissipation, and increased design flexibility. However, these often phenomenal achievements are obtained at the expense of higher variability in the performance of analogue circuits. This means that fabrication yield is reduced and that malfunctions during the lifetime of analogue circuits become more frequent. These problems call for test strategies that can reliably detect and diagnose failing devices, as well as built-in self-test (BIST) strategies that allow safety-critical analogue circuits to examine their own functional health during their lifetime, in order to detect and report faults that may jeopardise the reliability of the application in which they are deployed.

This project aims to leverage the power of machine learning to develop new test strategies for analogue ICs. The long-term objective is to facilitate the realisation of testable and reliable analogue ICs, thus enabling reliable computing and fostering technology trustworthiness.

Machine learning is used to address the following emerging and open-ended test challenges:

1. In theory, analogue ICs can be thoroughly tested in a high-volume production setting. However, this comes at a prohibitively high cost that can escalate to the point where the benefits of introducing a new technology are quickly wiped out. Testing typically requires explicitly verifying compliance with a large variety of specifications that determine the correct operation of the analogue IC. This involves lengthy test times and, in addition, necessitates the use of sophisticated and expensive test instrumentation. The high levels of integration compound this challenge since they hinder the application of tests and the readout of responses. In general, the aim is to simplify and standardise the test infrastructure and to decrease test time. To this end, machine learning is used to infer the outcome of the aforementioned test procedure from a simple test measurement pattern. The underlying idea is that this measurement pattern tracks the same variations that affect the performances, so machine learning can be used to uncover its intricate correlation with the actual performances of the analogue IC.

2. A comprehensive fault diagnosis scheme is needed to:

(a) understand the sources of failure in the analogue IC prototypes, in order to meet the time-to-market goal;
(b) gather information regarding the underlying failure mechanisms, in order to enhance yield for future IC generations; and
(c) understand the root cause of failure in analogue ICs that are part of a larger safety-critical system, in order to repair the system if possible, gain insight into environmental conditions that can jeopardise the system's health, and apply corrective actions that will prevent the failure from recurring and thereby extend the safety features.

To this end, machine learning is used to mine large data sets in order to isolate the faulty components of analogue ICs and to resolve fault ambiguity (distinct faults that have the same influence on the IC behaviour).

3. Incorporating BIST capabilities in an analogue IC may significantly simplify the task of detecting manufacturing defects. However, the key advantage of BIST is that it can also be performed on-line in the field of operation and can thus target malfunctions that occur during the lifetime of the analogue IC as a result of environmental impact and wear. Therefore, BIST is vital to analogue ICs deployed in safety-critical applications, sensitive environments, and remote-controlled systems. To this end, the project aims to develop on-chip learning techniques that employ simple built-in structures to predict the degradation of the analogue IC.
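The idea behind challenge 1, inferring the performances from a simple measurement pattern, can be illustrated with a minimal sketch. It is not the project's method: the linear least-squares model, the synthetic device generator, and all names and numeric values (`synthetic_devices`, `fit_alternate_test`, `predict_performances`, the mixing matrix, the noise levels) are assumptions chosen only to show how a shared source of variation makes a cheap measurement pattern predictive of a costly performance.

```python
import numpy as np


def synthetic_devices(n, rng):
    """Generate hypothetical devices (illustrative numbers only).

    Two hidden process parameters drive both the low-cost measurement
    pattern (3 values per device) and the costly performance (1 value).
    """
    process = rng.normal(size=(n, 2))
    mixing = np.array([[1.0, 0.5, -0.3],
                       [0.2, -1.0, 0.8]])
    measurements = process @ mixing + 0.01 * rng.normal(size=(n, 3))
    performance = 20.0 + process @ np.array([1.5, -0.7]) + 0.01 * rng.normal(size=n)
    return measurements, performance


def _with_intercept(measurements):
    # Append a column of ones so the model can learn an offset.
    return np.column_stack([measurements, np.ones(len(measurements))])


def fit_alternate_test(measurements, performances):
    """Fit a linear map from measurement patterns to performances."""
    coef, *_ = np.linalg.lstsq(_with_intercept(measurements), performances, rcond=None)
    return coef


def predict_performances(coef, measurements):
    """Predict performances of new devices from their measurement patterns."""
    return _with_intercept(measurements) @ coef


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train_m, train_p = synthetic_devices(200, rng)  # devices with full test data
    test_m, test_p = synthetic_devices(100, rng)    # devices tested cheaply
    coef = fit_alternate_test(train_m, train_p)
    rms = np.sqrt(np.mean((predict_performances(coef, test_m) - test_p) ** 2))
    print(f"RMS prediction error on unseen devices: {rms:.4f}")
```

In practice the relationship between measurements and performances is usually non-linear, which is why a learned model rather than a fixed formula is needed; the linear fit here only keeps the sketch short.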

The achievements of the project with regard to the three research axes outlined above are the following:

1. We developed an adaptive machine-learning-based testing paradigm that maps an alternative, low-cost measurement pattern to one of three possible decisions: direct pass (in which case the performances are also predicted), low confidence, or direct fail. The intermediate decision implies that a test decision based solely on the alternative measurement pattern is prone to error and suggests that further action be taken. By incorporating this intermediate level, we render the proposed low-cost test paradigm equivalent to the costly standard test approach in terms of resultant test errors. This work won the best paper award at the 2009 IEEE European Test Symposium, which is one of the major international conferences in the field of IC testing.

2. Failure analysis (FA) of defective ICs is traditionally performed using light emission, laser probing, picosecond imaging, etc. All these methods consist of observing failures by their optical characteristics. However, with the continuing reduction in feature sizes and the high complexity of modern ICs, the time and cost required to apply these methods have become intolerable. To address this, we developed a methodology based on machine learning to facilitate the diagnosis of local spot defects in analogue ICs. The methodology is capable of determining the root cause of failure and appropriately guides the aforementioned classical FA methods, thus reducing the required time-to-diagnose. We demonstrated the method on a transceiver designed by NXP Semiconductors, taking into consideration the realities of an industrial, large-scale case study. Diagnosis of failed parts is very important since this particular device is used in safety-critical automobile systems.

3. We designed and fabricated, for the first time, sensors that monitor high-speed analogue circuits while being transparent to them. This non-intrusive property is a key characteristic because the sensors can be designed independently of the high-speed circuit and, most importantly, they do not degrade its performance. One type of sensor consists of dummy structures and process control monitors that, by virtue of being in close proximity to the high-speed circuit on the die, sense process variations similar to those it experiences. The operation of these sensors capitalises on the slow-varying and smooth spatial variations that inevitably exist across the die. A second type of sensor is a temperature sensor that is also placed in close proximity to the high-speed circuit. The temperature sensor tracks temperature shifts in the vicinity of the high-speed circuit; such shifts are caused by a change in its power consumption, which in turn implies the presence of a defect. Furthermore, we designed and fabricated a neural network that takes the sensor outputs as inputs and provides a one-bit response indicating whether the analogue IC is functional or faulty. The neural network is fully configurable with 10 neurons and 100 synapses, which allows us to study various topologies that can implement classification boundaries of increased non-linearity. This work was carried out in collaboration with NXP Semiconductors, Netherlands and Yale University, United States of America (USA).
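The three-way decision described in achievement 1 can be sketched as follows. This is only an illustration under stated assumptions: the report does not say how the low-confidence region is defined, so the symmetric guard band around the specification limits, the function name `decide`, and the example limits are all hypothetical.

```python
from enum import Enum


class Decision(Enum):
    PASS = "direct pass"
    LOW_CONFIDENCE = "low confidence"
    FAIL = "direct fail"


def decide(predicted, lower_spec, upper_spec, guard_band):
    """Map a predicted performance to one of three test decisions.

    Assumption: a device is only passed or failed outright when the
    prediction lies at least `guard_band` away from the nearest
    specification limit; anything closer is flagged as low confidence.
    """
    if guard_band < 0:
        raise ValueError("guard_band must be non-negative")
    if lower_spec > upper_spec:
        raise ValueError("lower_spec must not exceed upper_spec")

    # Positive inside the specification window, negative outside it.
    margin = min(predicted - lower_spec, upper_spec - predicted)

    if margin >= guard_band:
        return Decision.PASS
    if margin <= -guard_band:
        return Decision.FAIL
    return Decision.LOW_CONFIDENCE


if __name__ == "__main__":
    # Hypothetical specification window of 18 to 22 with a guard band of 0.5.
    for value in (20.0, 21.8, 22.3, 23.0):
        print(value, decide(value, 18.0, 22.0, 0.5).value)
```

Devices that land in the low-confidence region would then be routed to further action, for example the standard specification test, so that only a fraction of devices incurs the full test cost.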