GPU-WEAR, Ultra-low power heterogeneous Graphics Processing Units for Wearable/IoT devices

Periodic Reporting for period 2 - GPU-WEAR (GPU-WEAR, Ultra-low power heterogeneous Graphics Processing Units for Wearable/IoT devices)

Reporting period: 2017-06-01 to 2018-05-31

End users keep demanding better wearable and IoT product experiences. These depend on extended battery lifetime, higher performance, and the quality of the visual experience. Over 64% of today’s end users are not satisfied with their mobile-device battery lifetime, and 30% of first-generation smartwatch buyers returned their watch because the battery lifetime was limited to 25 hours of regular use per day. High performance and vibrant display technology in sensor-based mobile and data-acquisition devices come at the expense of a short battery lifetime. The majority of today’s available devices and their embedded SoCs are not optimized to meet user expectations in terms of power consumption.
The GPU-WEAR concept is based on a holistic approach (HW, SW, API, and compiler level) to reduce power consumption in wearable/IoT devices and enable developers to deliver power-efficient applications for embedded GPUs. The new family of heterogeneous, multicore GPUs is driven by the graphics characteristics of wearable/IoT devices and displays. Both core types of the NEMA heterogeneous GPUs are powered by a single, “morphable”, low-power green ISA, and therefore share a single executable and a single software/compiler toolchain. Moreover, in the GPU-WEAR project we are developing display-aware and content-aware graphics technology.
Display-aware graphics operations aim to exploit the relation between the properties of the target screen (display size/resolution and display technology) and the accuracy of the graphics operations (precision of computations and storage data types). The goal is to reduce power without adversely affecting the visual quality. For example, the computation needs of a system equipped with a 1080x1920, RGB888 display (a typical smartphone) clearly differ from those of a system with a 300x300, RGB333 display (a typical smartwatch). Content-aware technology aims to exploit the inherent imbalance in the graphics workloads of different applications. For example, on an Android smartphone, the graphics computational needs differ when someone browses photos, plays “Angry Birds-like” games, or just looks at the clock app. A heterogeneous GPU can exploit this imbalance in the graphics workload, leading to significant power savings.
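The difference between the two example displays can be made concrete with a small back-of-the-envelope calculation. The sketch below is only illustrative: the function names are hypothetical, and it assumes three color channels with no alpha and no padding, which the text does not specify.

```python
def framebuffer_bits(width, height, bits_per_channel, channels=3):
    """Bits needed to store one full frame (assumes no alpha, no padding)."""
    return width * height * bits_per_channel * channels


def quantize_channel(value8, bits):
    """Reduce an 8-bit channel value to `bits` bits by dropping low-order bits."""
    return value8 >> (8 - bits)


# Smartphone-class display from the text: 1080x1920, RGB888
phone_bits = framebuffer_bits(1080, 1920, 8)

# Smartwatch-class display from the text: 300x300, RGB333
watch_bits = framebuffer_bits(300, 300, 3)

# How many times more frame data the larger display has to produce and store
ratio = phone_bits / watch_bits

print(phone_bits, watch_bits, round(ratio, 2))

# An 8-bit channel value of 200 keeps only its top 3 bits on an RGB333 display
print(quantize_channel(200, 3))
```

Under these assumptions a single frame for the smartwatch-class display carries roughly 61 times fewer bits, which is the kind of headroom that reduced-precision computation and storage can try to exploit.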
The project has now been running for 2 years, and all the development groups of Think Silicon have been collaborating to address the project from different perspectives, from the VLSI and architectural level to graphics APIs/applications and compiler optimizations.
An FPGA prototype of the heterogeneous, multicore NEMA GPU has been built and is currently being optimized for various graphics workloads. The compiler toolchain of this new architecture is based on the open LLVM framework. In addition, based on the feedback that we received from the marketing and sales teams of the company, we decided to develop and release another version of the NEMA GPU called NEMA|p. NEMA|p is an ultra-low-power and ultra-low-gate-count 2D GPU. As part of the project, an innovative z-buffer compression technique, a new rasterization unit (designed using the HLS approach), and an ultra-low-gate-count vertex processor were designed and integrated into the NEMA GPUs. Moreover, each hardware component of NEMA is equipped with smart VLSI-level power-saving techniques. A value memoization unit that performs approximate value reuse has been built for NEMA and integrated into the LLVM toolchain.
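The idea of approximate value reuse can be illustrated in software. The following is a minimal sketch, not the NEMA hardware design: the class name, the bucketing scheme, and the `step` parameter are all assumptions made for illustration. Inputs that fall into the same bucket share one cached result, trading a bounded error for fewer computations.

```python
import math


class ApproxMemo:
    """Illustrative approximate memoization table (hypothetical design)."""

    def __init__(self, func, step):
        self.func = func    # function whose results are reused
        self.step = step    # bucket width: larger means more reuse, more error
        self.table = {}     # bucket index -> cached result
        self.hits = 0
        self.misses = 0

    def __call__(self, x):
        key = round(x / self.step)  # nearby inputs map to the same bucket
        if key in self.table:
            self.hits += 1
            return self.table[key]
        self.misses += 1
        # Evaluate at the bucket centre so the cached value does not depend
        # on which input happened to arrive first.
        result = self.func(key * self.step)
        self.table[key] = result
        return result


memo_sin = ApproxMemo(math.sin, step=0.01)
first = memo_sin(1.000)   # miss: computes and stores sin(1.00)
second = memo_sin(1.002)  # hit: reuses the stored value
print(memo_sin.hits, memo_sin.misses, abs(second - math.sin(1.002)))
```

For a function like sine, whose slope never exceeds 1, the error introduced by reuse in this sketch stays within about half the bucket width.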
At the software/API level, the company developed the NEMA|GFX-API library, which interfaces directly with the NEMA GPUs. NEMA|GFX-API has an exceptionally small memory footprint, making it an ideal solution for devices with limited memory resources. On top of this API, we developed GPU-WEAR-LIB, an OpenGL ES 2 driver for Linux and Android systems. It includes specific annotations for static (compiler-level) and dynamic (run-time) QoS management of graphics workloads in order to get the full benefits of the underlying heterogeneous GPU.
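To give a feel for how static and dynamic QoS management could interact on a heterogeneous GPU, here is a minimal sketch. It does not reflect the actual GPU-WEAR-LIB annotations or API: the `Workload` type, the `pick_core` function, the core labels, and the 16.7 ms frame budget are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical labels for the two core types of a heterogeneous GPU
SMALL, LARGE = "small_core", "large_core"


@dataclass
class Workload:
    name: str
    static_hint: str  # stands in for a compiler-level annotation


def pick_core(workload, last_frame_ms, budget_ms=16.7):
    """Choose a core type from a static hint plus a run-time measurement."""
    if last_frame_ms is None:
        # No run-time data yet: trust the static annotation.
        return workload.static_hint
    if last_frame_ms > budget_ms:
        # Frame budget missed: escalate to the larger core.
        return LARGE
    if last_frame_ms < 0.5 * budget_ms:
        # Plenty of slack: the smaller, lower-power core should suffice.
        return SMALL
    return workload.static_hint


clock = Workload("clock", SMALL)
game = Workload("game", LARGE)
print(pick_core(clock, None), pick_core(clock, 20.0), pick_core(game, 5.0))
```

A practical policy would likely need hysteresis to avoid bouncing between core types on consecutive frames; the sketch leaves that out for brevity.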
All the SW and HW components are being validated by a continuous integration (CI) framework that includes unit, smoke, regression, performance, system integration, and stability tests (developed as part of the project). At the hardware level, this framework utilizes smart code coverage and UVM-based verification techniques.
On the tool side, 5 GPU-WEAR products were released to the open community:
- NEMA|PIX-Presso
- NEMA|Bits
- NEMA|SHADER-Edit, and
- NEMA|GUI-Builder
The released material includes both FPGA bitstreams (HW) and SW analysis/development tools.
During the lifetime of the project, the company entered into 3 license agreements, a Marketing License Agreement to co-develop an IoT/Wearables platform with Synopsys, a strategic partnership with SiFive, and 5 evaluation licenses with various MCU companies (including Tier-1 companies). The NEMA GPUs (being developed as part of the project) were also included in the prestigious Microprocessor Report issued by The Linley Group. Finally, 3 patents have been granted and 4 patent applications have been submitted to the USPTO.
The world market for wearable technologies is experiencing continuous high growth, supported by significant demand for wearable applications. In 2014, 36% of wearable devices featured a display, while 9% had a built-in camera. Devices featuring displays and built-in cameras represent the target market for the GPU-WEAR technology. In 2015, the wearable technologies market gained significant traction within specific product categories, including activity and health monitors, imaging products (specifically action cameras), and smartwatches. However, despite the enormous market potential, there are still only 6 companies in the world capable of addressing the market for mobile GPUs.
There are already numerous applications and products in the market where GPU-WEAR technology can find its place. The GPU-WEAR heterogeneous multicore GPUs consist of two different types of microarchitectures: NEMA|t and NEMA|S. The technical benefits of GPU-WEAR technology are vast and can lead to considerable commercial benefits for the company’s customers by saving money, increasing margins, and contributing to their bottom line.
Beyond the wearables/IoT market, GPU-WEAR will address, over the next five years, market opportunities such as: Mobile Cloud Computing, High Performance Computing, Cloud services and Datacenter, Intelligent Home, Digital Display Security Signage, and Image (ISP) and post-camera processing. However, until the end of the GPU-WEAR project, we have decided to focus our efforts on penetrating the emerging global market of machine learning (ML). In particular, we are currently developing a working prototype of an ultra-low-power Neural Network inference accelerator based on the GPU-WEAR technology, targeting mainly low-end and edge devices and applications.
USPTO Registered Trademark for Think Silicon GPUs