CORDIS - EU research results

Open source deep learning platform dedicated to Embedded hardware and Europe

Periodic Reporting for period 2 - NEUROKIT2E (Open source deep learning platform dedicated to Embedded hardware and Europe)

Reporting period: 2024-06-01 to 2025-05-31

A new trend emerging in the Artificial Intelligence landscape is the convergence of AI and the Internet of Things into "edge AI": running machine learning algorithms directly on connected devices. This approach addresses the drawbacks of cloud-based systems, such as high latency and weaker security guarantees, and reduces the carbon footprint.
Edge AI opens up new opportunities for AI applications and many domains are preparing for this revolution, including transportation, field service management, finance, healthcare, manufacturing, retail, agriculture, and supply chain. However, current AI development frameworks are not fully equipped for this paradigm shift.
NEUROKIT2E aims to provide an open-source and sovereign platform for Embedded AI with several ambitions:
1. Position Europe as a world leader with tools capable of meeting real-time, data confidentiality, energy consumption and usability requirements.
2. Provide a single end-to-end development platform that will integrate hardware models with AI models to optimize them for embedded devices.
3. Develop advanced compression, pruning and optimization methods to reduce model size while keeping the performance of the original network.
4. Enable the combined utilization of synchronous coding (tensors) and event-driven coding (spikes) in a single network.
5. Provide tools that allow the conversion and deployment of AI applications on a large number of industrial or custom hardware architectures.
This platform will enable exploration and generation of standalone code that can be exported to embedded hardware devices. It will be compatible with existing frameworks and will enable performance analysis and comparisons using tools adapted to each hardware target.
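The standalone-code-generation idea can be illustrated with a toy sketch: a routine that turns a small dense layer, with its weights baked in as constants, into self-contained C source for an embedded target. This is an illustration of the concept only, not AIDGE's actual export backend, and the function and variable names are invented for the example.

```python
# Toy sketch of standalone code generation for embedded targets:
# emit a C function computing y = W @ x + b with the weights
# embedded as constants, so the output has no runtime dependencies.
# Illustrative only - not AIDGE's real exporter.

def emit_dense_layer_c(name, weights, biases):
    """Generate C source for a fully-connected layer with fixed weights."""
    n_out = len(weights)
    n_in = len(weights[0])
    w_flat = ", ".join(f"{w:.6f}f" for row in weights for w in row)
    b_flat = ", ".join(f"{b:.6f}f" for b in biases)
    return (
        f"static const float {name}_W[{n_out * n_in}] = {{{w_flat}}};\n"
        f"static const float {name}_b[{n_out}] = {{{b_flat}}};\n"
        f"void {name}(const float *x, float *y) {{\n"
        f"    for (int i = 0; i < {n_out}; ++i) {{\n"
        f"        y[i] = {name}_b[i];\n"
        f"        for (int j = 0; j < {n_in}; ++j)\n"
        f"            y[i] += {name}_W[i * {n_in} + j] * x[j];\n"
        f"    }}\n"
        f"}}\n"
    )

src = emit_dense_layer_c("fc1", [[0.5, -0.25], [1.0, 0.75]], [0.1, -0.2])
```

A real exporter would additionally handle graph traversal, memory planning and target-specific kernels; the point here is only that the generated artifact stands alone.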
The NEUROKIT2E consortium spans 5 EU countries (France, the Netherlands, Austria, Germany and Italy) with a balance between private and public research: 14 private companies, including 5 large industrial groups, 5 RTOs and 6 universities.
The first phase of the project laid the foundations for the study and maturation of new techniques for developing and porting neural networks to embedded hardware. The first stable version of the AIDGE framework was released in April 2024. Its core module manages fundamental operations, including graph manipulation and model optimization. AIDGE's version history reflects a steady progression toward advanced capabilities: early releases focused on network inference, while subsequent versions introduced training and quantization features, improved Python and C interfaces, and expanded support for hardware targets. The latest release, version 0.8.0, introduces new functionality such as Spiking Neural Networks, Post-Training Quantization, Quantization-Aware Training and ONNX simplification.
As an open-source project hosted by the Eclipse Foundation, AIDGE fosters a collaborative environment benefiting developers and researchers in the embedded AI community and places Europe at the forefront of this competitive market.
The applicative part of the project is structured around seven use cases in fields such as autonomous transport, satellite observation, healthcare and smart buildings. Specifications have been drawn up, and generic and specific KPIs have been defined for all use cases. Although Work Package 6 (Implementation and demonstrations) began at the end of Period 1, the use case implementations will be finalized in Period 3, because the developments essential for integrating their results will only be completed by that stage. Period 2 saw the development of building blocks for the use cases and preparation for integrating the results from the other work packages.
Quantization- and hardware-aware training methods have been investigated for object detectors and transformer-based networks. Low-bit post-training quantization methods have also been studied and tested on keyword retrieval and gesture recognition applications; quantization to 4 bits showed very little performance loss (about 2%).
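As a concrete illustration of low-bit post-training quantization, the sketch below fake-quantizes a weight tensor to 4-bit symmetric signed integers. This is a simplified example: real PTQ pipelines additionally calibrate activation ranges and often use per-channel rather than per-tensor scales.

```python
# Minimal sketch of symmetric per-tensor post-training quantization.
# Each float is mapped to a signed n-bit integer and back; the
# round-trip error is bounded by half a quantization step.

def quantize(values, n_bits=4):
    """Map floats to signed n-bit integers with a shared scale."""
    qmax = 2 ** (n_bits - 1) - 1                  # 7 for 4 bits
    scale = (max(abs(v) for v in values) / qmax) or 1.0
    q = [max(-qmax - 1, min(qmax, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the integers."""
    return [qi * scale for qi in q]

weights = [0.82, -0.41, 0.05, -0.77, 0.33]
q, scale = quantize(weights, n_bits=4)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Storing 4-bit integers plus one scale instead of 32-bit floats gives roughly an 8x size reduction, which is the mechanism behind the small accuracy losses reported above.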
New teacher-student distillation methods, based on metaheuristics, gradient trajectory matching and dataset compression have been studied, which led to up to 95% model size reduction with maintained or improved performances, enabling smaller networks to be embedded on constrained hardware.
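The classic soft-target term underlying teacher-student distillation can be sketched as follows. Note that this shows only the standard Hinton-style objective (KL divergence between temperature-softened outputs); the metaheuristic, trajectory-matching and dataset-compression variants studied in the project are not reproduced here.

```python
# Sketch of the standard distillation loss: the student is trained to
# match the teacher's temperature-softened output distribution.
import math

def softmax(logits, T=1.0):
    """Numerically stable softmax with temperature T."""
    m = max(logits)
    exps = [math.exp((l - m) / T) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) on softened outputs, scaled by T^2."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return T * T * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [4.0, 1.0, 0.2]
loss_aligned = distillation_loss(teacher, [4.0, 1.0, 0.2])  # student matches
loss_off = distillation_loss(teacher, [1.0, 4.0, 0.2])      # student disagrees
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is what lets a much smaller student inherit the large teacher's behaviour.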
Pruning and tensor decomposition methods are also being studied to reduce the size and simplify the structure of large networks. Results on convolutional networks showed a significant reduction in model size with minimal accuracy loss: the removal of weights yielded memory savings, and inference times were reduced, though the improvement varied with the layer configuration and the level of pruning.
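The simplest form of pruning can be sketched as global magnitude pruning: zero out the smallest fraction of weights. This is an unstructured, assumption-laden toy; the methods studied in the project also cover structured pruning and tensor decomposition, and real pipelines fine-tune the network after pruning.

```python
# Minimal sketch of unstructured magnitude pruning: remove the
# `sparsity` fraction of weights with the smallest absolute value.

def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of weights."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # k-th smallest magnitude acts as the pruning threshold
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02]
pruned = magnitude_prune(w, sparsity=0.5)
```

Zeroed weights can then be stored sparsely and skipped at inference time, which is where the memory and latency savings mentioned above come from.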
With the fast development of AI, Spiking Neural Networks (SNNs) have shown considerable success thanks to their distinct data-processing technique: because spikes are binary, SNNs eliminate the need to perform costly multiplications. Compression techniques have been researched to improve their efficiency and were tested on an obstacle-detection system that processes LiDAR data. Although knowledge distillation and pruning led to higher compression, performance dropped by more than 10%; quantization from 32-bit to 8-bit gave a smaller size reduction (typically 75%) while keeping performance very close to the original (less than 2% loss).
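Why binary spikes remove multiplications can be seen in a toy integrate-and-fire step: with inputs in {0, 1}, the weighted sum reduces to accumulating the weights of the neurons that fired. This is a deliberately minimal sketch, not the project's obstacle-detection model.

```python
# Toy integrate-and-fire neuron: because input spikes are 0 or 1,
# the synaptic "multiply-accumulate" degenerates into pure addition.

def membrane_update(spikes, weights, potential, threshold=1.0):
    """One step: accumulate spiked weights, fire and reset at threshold."""
    # No multiplications: just add the weight wherever the input spiked.
    potential += sum(w for s, w in zip(spikes, weights) if s == 1)
    if potential >= threshold:
        return 1, 0.0                  # emit an output spike and reset
    return 0, potential

spikes = [1, 0, 1, 1]
weights = [0.4, 0.9, 0.3, 0.2]
out1, pot1 = membrane_update(spikes, weights, potential=0.0)  # sub-threshold
out2, pot2 = membrane_update(spikes, weights, potential=pot1)  # fires
```

On hardware, replacing multiply-accumulate units with adders is a large part of the energy advantage attributed to SNNs.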
In the second year, the project made considerable progress, often operating at the edge of the state of the art in both methodology and practical impact.
AIDGE has often proven to surpass well-known general-purpose frameworks. By integrating hardware-aware optimization with quantization and pruning, our methods enable fine-grained control over numerical representation and computation without sacrificing model accuracy. These optimization strategies reduce memory footprint and compute cost far beyond standard single-purpose libraries, while maintaining consistent performance across devices.
The project introduced new hardware architectures designed to make AI systems smaller and more efficient. Instead of requiring massive amounts of energy, our models leverage different data representations (analog vs. digital), processing methods (in-memory computing), adaptive or distributed training strategies, and event-based computing, which together yield very promising results. These methods reduce computational requirements and open the door to applications in domains where data is scarce, sensitive, or hard to obtain.
Moreover, we have pushed beyond the current boundaries of explainability and trustworthiness. Our systems produce interpretable computational traces, offering clarity into how decisions are made and enabling domain experts to validate or correct outputs with far greater precision. This contrasts with black-box approaches and positions our work as a transformative step in safety-critical domains.
Our benchmarks demonstrate significant quantitative and qualitative results, showing on par or superior accuracy, faster inference, and more efficient performance across diverse models and devices. By making advanced DNNs more deployable, more efficient, and safer, the project sets new targets for AI efficiency beyond the state of the art.
More details on the project's website: https://neurokit2e.eu/