Periodic Reporting for period 2 - LPGPU2 (Low-Power Parallel Computing on GPUs 2)
Reporting period: 2017-04-01 to 2018-09-30
The main goal of the LPGPU2 project was to help developers create software for low-power GPUs by providing a complete performance and power analysis framework that addresses the power problem from different angles: defining data collection standards, reliably measuring and estimating power consumption, and developing a tool suite that provides rich visualizations and insights to software developers. The LPGPU2 tool suite was built on top of the open source project CodeXL (https://gpuopen.com/compute-product/codexll) and consists of four main elements: data capture, data visualization, data analysis, and power estimation and measurement.
The main objectives of this project and the corresponding conclusions are:
1. To help programmers to improve the energy efficiency of compute and graphics applications for existing and emerging APIs. The LPGPU2 tool chain is equipped with a smart Feedback Engine aimed at making optimizations simple by providing insightful guidance to the user on how to improve performance and power consumption. The LPGPU2 tool suite has been validated using applications based on four existing and emerging APIs (OpenCL, SYCL, OpenGL, and Vulkan) that contain demanding graphics and compute parts.
2. To enable programmers to write their software once and run it on a variety of different low-power GPUs. The project has put forward a standard interface for data collection. Establishing standard interfaces also enhances the portability of applications across multiple platforms and standards. Moreover, the LPGPU2 tool is equipped with analysis modes that support optimizations for four standards of the Khronos Group; the use of open standards also contributes to the portability objective.
3. To increase productivity in GPU software development. To achieve this objective, the consortium decided to release the tool suite as open source. In this way, even SMEs with limited resources have access to a sophisticated tool and enjoy benefits otherwise available only to larger or better-funded companies.
4. To reduce the hardware, software, and device driver design and development cycles of mobile GPUs. The LPGPU2 project offers a vertical toolchain. The term vertical means that the tool is able to gather (via a standardized interface) information from the GPU hardware, GPU driver, API, and application levels and to visualize this information in a seamless fashion. As such, the toolchain can serve as a central point of reference. Consequently, it can be used to facilitate communication between different design teams, thereby shortening the long development cycles of mobile GPUs.
5. To bring technologies to market in a commercializable form, including productizing and commercializing the technologies developed in the previous LPGPU (FP7 STREP) project. This includes i) bringing the SYCL standard into real-world AI applications generating commercial interest, ii) putting optimized video decoders into commercial video playback systems, iii) increasing the competitive features of Think Silicon Nema GPUs by enhancing them with smart performance/power monitoring capabilities, and iv) increasing the TRL of the LPGPU power measurement testbed and bringing it closer to a commercial product.
On the applications side, Samsung has also developed a range of applications showcasing font rendering, augmented reality, and virtual reality. These will be further optimized using the LPGPU2 tool suite and will help improve Samsung's mobile graphics platform, which is used by millions of people worldwide. Think Silicon has developed a set of Image Signal Processing (ISP) applications using Vulkan and the NemaGFX API. An FPGA prototype has been implemented, and the NemaGFX version of the ISP algorithms has been demonstrated at industrial exhibitions. Spin Digital has developed a complete media player using its H.265 codec and a new high-performance video rendering engine that uses the latest graphics APIs (Vulkan, DX12) and allows for the creation of next-generation media playback applications (Ultra-HD support, HDR, etc.). These were demonstrated at the world's largest media industry exhibitions: NAB (Las Vegas), IBC (Amsterdam), and InterBEE (Tokyo). Codeplay has ported the TensorFlow machine learning framework to OpenCL via SYCL so that the most-used AI framework in the world can run on any energy-efficient AI accelerator that supports OpenCL.