Periodic Reporting for period 2 - eProcessor (European, extendable, energy-efficient, energetic, embedded, extensible, Processor Ecosystem)
Reporting period: 2022-10-01 to 2024-03-31
•eProcessor technology is based on the RISC-V ISA and features high-performance computing and data analytics accelerators coupled to a high-performance, low-energy out-of-order processor (Europe's first high-performance out-of-order 64-bit RISC-V platform). This is a major first step towards an open European software/hardware ecosystem, which will guarantee technology independence.
•eProcessor will meet the performance and energy requirements of new and existing HPC applications by co-designing solutions that provide high performance, low-power, and fault tolerance. Uniquely, we can specialize all components of the system in the context of a broad application domain: a combination of energy efficient accelerators, adaptive on-chip memory structures, and a flexible and high performance energy-efficient CPU, with the corresponding open software stack.
•eProcessor uses a diverse set of applications in the HPC, artificial intelligence, deep learning, machine learning, and bioinformatics domains to drive the design of the overall system.
•Many applications use sparse data sets and/or low/mixed-precision. Instead of focusing on the peak performance of dense computations, eProcessor targets a broader collection of applications by developing a system targeting sustained application performance.
•eProcessor partners are leveraging their existing IP from multiple EU projects such as EPI, LEGaTO, MEEP, POP2 CoE, Tulipp, EuroEXA, ExaNeSt and DeepHealth, extending their capabilities and improving their Technical Readiness Level (TRL). In addition, eProcessor is collaborating with two other projects of the EuroHPC program, SparCity and RED-SEA.
•eProcessor combines industry-standard methodology and cutting-edge research to accelerate exploitation. Traditionally, academic hardware projects lack the rigor required in industry. eProcessor extends traditional pedagogy into this new domain of high-performance hardware design. As a result, the project will deliver silicon-proven IP (higher TRL) that offers a faster time-to-market and therefore a higher potential for exploitation. Adoption of the IP from this proposal is expected to be much higher than for any simulation-only or emulation-only proposal, because the energy-efficient IP funded through this proposal is silicon-proven.
•Specification of the whole eProcessor system, including architecture, emulation and implementation environment, operating system, system software, compiler, performance tools, and application use cases.
•Design and implementation of fully functional IP blocks for the NoC, the L2 caches, the AI accelerator based on systolic arrays, the mixed-precision functional units, the I/O devices and the peripherals.
•Design and implementation of a vector processing unit that executes SIMD instructions and has a direct path to memory.
•Integration and verification of the out-of-order core, the vector processing unit and the remaining IP blocks into two eProcessor architectures, one single-core and one multi-core.
•Development of a gem5-based simulation environment and a thorough performance evaluation of the eProcessor architecture with many different parameters.
•Development of the FPGA prototypes of the two eProcessor architectures, both able to successfully boot Linux and to execute applications.
•Complete synthesis, place-and-route and physical design of the single-core eProcessor architecture.
•Fabrication of the single-core eProcessor ASIC by GlobalFoundries in a 22 nm technology node. The chip packaging is currently ongoing.
•Complete specification of the PCB, the pinout and the package for the single-core eProcessor ASIC.
•Porting and patching of the Linux kernel to the eProcessor FPGA prototypes.
•Porting of alternative operating systems such as no-MMU Linux, Zephyr and OpenAMP to the eProcessor FPGA prototypes.
•Porting and optimizing libraries for the eProcessor ecosystem, including efficient resource management techniques in OpenMP and software support for fault tolerance.
•Development of an LLVM compiler that can generate RISC-V SIMD code, including novel compiler support for mixed- and low-precision floating-point operations.
•Porting of performance and debug tools to different RISC-V SDVs.
•Specification of the application use cases, porting to RISC-V, and clear definition of the plan for optimizing them on the eProcessor architecture.
•Development of a complete suite of microbenchmarks to evaluate the different components of the eProcessor architecture.
•Improve machine learning accelerators by developing arithmetic units that support a wide range of reduced and mixed precisions, and by exploring new formats for reduced-precision floating-point training.
•Improve application performance using cooperative adaptive on-chip memories.
•Devise a coherent network-on-chip to interconnect the CPU with the accelerators.
•Optimize and extend the OpenMP runtime and the compiler to leverage the resource management knobs of the eProcessor platform. This will allow OpenMP to guide cache coherence optimizations and to implement energy-efficient scheduling and synchronization.
•Design and integrate novel applications with hardware accelerators for artificial intelligence, machine learning, deep learning, and bioinformatics.
•Provide fault tolerance for critical processor structures with various error detection strengths (parity or lightweight ECC) and software support for efficient error recovery.
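To illustrate the weakest of the error detection strengths mentioned in the last item, the following is a minimal software sketch of single-bit even parity. It is not the eProcessor hardware design: the function names (`parity_bit`, `encode`, `check`, `flip_bit`) and the 8-bit example word are assumptions made only for this illustration.

```python
def parity_bit(word):
    """Return the even-parity bit of a non-negative integer word."""
    # The parity bit is 1 when the word holds an odd number of set bits.
    return bin(word).count("1") % 2


def encode(word):
    """Pair a data word with its parity bit, as a protected structure might store it."""
    return word, parity_bit(word)


def check(word, stored_parity):
    """Return True when the word is consistent with the stored parity bit."""
    return parity_bit(word) == stored_parity


def flip_bit(word, position):
    """Model a soft error by inverting one bit of the word."""
    return word ^ (1 << position)


if __name__ == "__main__":
    data, parity = encode(0b10110010)           # hypothetical 8-bit word
    single_error = flip_bit(data, 3)
    double_error = flip_bit(single_error, 5)
    print(check(data, parity))          # clean word passes the check
    print(check(single_error, parity))  # one flipped bit is detected
    print(check(double_error, parity))  # two flipped bits go unnoticed
```

Parity detects any odd number of bit flips but cannot locate or correct them, so recovery has to come from elsewhere, for example from software. An error-correcting code spends more check bits per word to also correct errors, which is the trade-off behind offering several detection strengths.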
Beyond the advancements to the state of the art, the eProcessor project has tremendous potential for innovation, particularly around the widespread adoption and rapid evolution of an ecosystem based on open hardware with the RISC-V ISA. RISC-V CPU technologies are forecast to exhibit a compound annual growth rate of 146.2% on average between 2018 and 2025, exceeding 62 billion deployed cores by 2025, over a range of market segments including the computer, consumer, communication, transportation, and industrial markets.
In addition to this phenomenal growth rate, the eProcessor consortium recognizes viable opportunities for innovation driven by two ongoing developments: (i) the convergence of computing platform requirements across HPC, artificial intelligence, machine learning, deep learning, and bioinformatics workloads; and (ii) the ever-growing demand for computational power from diverse application workflows. Power constraints necessitate a focus on efficient use of highly heterogeneous platform resources. To this end, the eProcessor project provides a roadmap for bringing key innovations to the development of an open European full-stack ecosystem based on a new RISC-V CPU coupled to multiple diverse accelerators.