AdvanCed Hardware/Software Components for Integrated/Embedded Vision SystEms

Periodic Reporting for period 2 - ACHIEVE (AdvanCed Hardware/Software Components for Integrated/Embedded Vision SystEms)

Reporting period: 2019-10-01 to 2022-03-31

Applications like assisted and autonomous driving, unsupervised surveillance or robot vision mandate real-time interpretation of the scene. This requires the incorporation of a higher level of intelligence at sensor level in order to extract the relevant information. The goal is to analyse the visual stimulus and to elaborate an adequate representation of the scene right at the sensor plane. This has very positive consequences for the power efficiency of the system.
Current trends in object recognition and classification rely on representation learning. Provided with a flexible internal structure and enough computing power, modern machine-learning systems have superseded the extraction of handcrafted features in favour of automatically discovering the most appropriate internal representation. This deep learning approach has revolutionized object recognition by dramatically reducing the error rate. The challenge today is to convey these processing capabilities to compact, lightweight and power-aware embedded vision systems and vision systems-on-a-chip. Moreover, in application scenarios like autonomous surveillance or intelligent transportation systems, these embedded vision systems will also be networked. Centralized processing is impractical; it is necessary to develop a distributed system in which smart devices cooperate towards a collective goal.
The approach to tackling these challenges needs to be multidisciplinary. Efficient analysis and multilevel optimization techniques are required. ACHIEVE-ITN has trained a new generation of scientists through a research programme on highly integrated HW-SW components for the implementation of ultra-efficient embedded vision systems as the basis for innovative distributed vision applications.
The major achievements of our research program are:
- The development of integrated sensing and processing chips that combine image capture with on-chip acceleration of feature extraction/learning with a limited number of resources and under a restricted power budget
- The conception of a chip architecture that extracts 2D and 3D information at sensor level for an enriched description of the scene that can be shared and combined between groups of camera nodes working in cooperation
- The design of hardware accelerators that will permit the implementation of heavy-duty feature and representation learning and deep learning inference
- The conception of compact and efficient reconfigurable embedded vision systems, where local processing of visual information is combined with agile transmission of metadata and a careful power management
- The development of the cooperative vision algorithms that will operate on an enriched representation of the scene that can be locally shared by a set of nodes, allowing them to react collectively
- Discarding the concept of a central hub where all the data crunching is performed in favour of a scalable distributed processing system in which visual information drives dynamic adaptation and feedback to enhance users’ experience
- The introduction of scalable, easily deployable, always-on, visual monitoring methods that will be the basis for a new class of products and services
The main scientific objective of the research programme is the development of a distributed vision platform composed of networked, smart and efficient embedded vision systems. This platform will be the basic infrastructure for several application scenarios that demand cooperative vision based on the in-node processing of the visual stimulus and extraction of relevant information. This de-centralized scheme renders the system scalable, easily deployable and resilient to partial failure. The scientific progress can be briefly described from the outcomes of the different work packages:
• we have completed the design of two chips working on simplified descriptions of the scene based on time-encoding of information at the pixel level, which have been fabricated, tested and validated;
• we have built an embedded vision system with a heterogeneous architecture that combines the flexibility of FPGAs and the robustness of GPUs;
• we have advanced in the understanding of the scene elements and devised a method for the creation of models that can help to implement cooperative vision algorithms;
• we have provided an embedded implementation of partial SLAM which is compatible with the smart camera node architecture already mentioned;
• we have also provided a methodology for the online learning of the camera network topology for multiple-view tracking of people and vehicles.

In parallel with the research programme, a program of network-wide activities has been carried out to complement the PhD programs of our ESRs:
• we have designed personal career development plans for all students;
• we have triggered inter-disciplinary work and explored synergies between work packages;
• we have held seminars on the sociological impact of surveillance technologies, open science, and ethics in scientific research and publication;
• we have organized courses on image sensors, embedded vision systems, cooperative smart data fusion, efficient CNN inference, and exploitation of research results;
• ESRs attended the summer schools and PhD forums organized by ACHIEVE-ITN partners;
• ESRs have received career guidance through panel sessions about the career beyond the PhD organised by ACHIEVE-ITN’s participants;
• 4 editions of WASC (the Collaborative Workshop on the Architecture of Smart Cameras) were held, in Coimbra 2018, Rennes 2019, Ghent 2020 (online) and Barcelona 2022 (in person again).
This project aims to contribute to the enhancement of the career perspectives and employability of researchers, and to their skill development. Also, it contributes at European level to structuring doctoral/ESR training and to strengthening European innovation capacity. The expected impact can be measured in their increased set of skills; the increase in higher-impact research and innovation output; and a greater contribution to the knowledge-based economy and society.

The training programme of ACHIEVE-ITN has allowed the participating researchers to acquire the expertise needed to meet a growing demand for experts in embedded systems, computer vision and image sensors. One visible piece of evidence is the relative ease with which most of the ESRs have found a job in industry. This is a contribution not only at the individual level, but also at the European level.

With respect to increasing the higher-impact research and innovation output, ACHIEVE-ITN participants have joined efforts to promote this transfer-oriented research. The students have developed a clear picture of the possibilities of technology transfer and the benefits derived from the conversion of their own research results into products and services of high added value.

Finally, with respect to the contribution to knowledge-based economy and society, ACHIEVE-ITN is in a good position to start exploiting technologies that allow information-intensive real-time monitoring of human activity in scenarios that can have a huge economic impact. This has been paired with legal and ethical caveats that compensate for the profound interaction that a vision-enabled artificial intelligence will have in our lives.
WASC 2019 at INSA-Rennes
Kick-off meeting at CSIC facilities in Brussels