Country | Type | Number | Dated | Case |
United States Of America | Published Application | 20220100484 | 03/31/2022 | 2019-123 |
Modern Graphics Processing Units (GPUs) consist of several Streaming Multiprocessors (SMs), each with its own Register File (RF) and a number of integer, floating-point, and specialized computational cores. A GPU program is decomposed into one or more cooperative thread arrays that are scheduled to the SMs. GPUs invest in large RFs to enable fine-grained, fast switching between executing groups of threads. As a result, the RFs are the most power-hungry components of the GPU, and the RF organization substantially affects the GPU's overall performance and energy efficiency.
Prof. Nael Abu-Ghazaleh and his research team have designed CORF, a novel, patent-pending architecture for register coalescing that improves performance and energy efficiency. Register coalescing combines multiple register reads into a single physical register read. The design exploits coalescing opportunities through a combination of compiler-guided register allocation and coalescing-aware register organization. To maximize operand coalescing opportunities, an extended design called CORF++ combines compiler-assisted register allocation with a reorganized RF.
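The idea of combining several register reads into one physical read can be illustrated with a toy model. The sketch below is not the CORF hardware or compiler. It assumes, purely for illustration, that each architectural register is placed in a (physical register, slot) pair and that operands sharing a physical register are served by a single read. The names `physical_reads`, `unpacked`, and `packed` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical placement: architectural register -> (physical register, slot).
# Assumption for this sketch: several values may share one physical register.
Placement = Tuple[int, int]


def physical_reads(sources: List[str], placement: Dict[str, Placement]) -> int:
    """Count the physical register reads needed for one instruction.

    Source operands that live in the same physical register are
    coalesced into a single read.
    """
    return len({placement[reg][0] for reg in sources})


# Each architectural register occupies its own physical register.
unpacked = {"r1": (0, 0), "r2": (1, 0), "r3": (2, 0)}
# r1 and r2 share physical register 0, so reading both costs one read.
packed = {"r1": (0, 0), "r2": (0, 1), "r3": (1, 0)}

instruction_sources = ["r1", "r2", "r3"]
reads_unpacked = physical_reads(instruction_sources, unpacked)
reads_packed = physical_reads(instruction_sources, packed)
print(reads_unpacked, reads_packed)
```

In this model the benefit depends entirely on which registers end up sharing a physical register, which is why the placement decision matters.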
CORF++ overview. At compile time, register alignment is determined through a graph coloring algorithm to maximize coalescing opportunities.
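For readers unfamiliar with graph coloring, the following is a generic greedy coloring sketch, not the CORF++ compiler pass. It assumes, as a simplification not stated in the source, that nodes are registers, that an edge joins two registers read by the same instruction, and that a color stands for an alignment slot, so that registers read together receive different slots. The function name `greedy_coloring` and the example graph are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def greedy_coloring(nodes: List[str], edges: List[Tuple[str, str]]) -> Dict[str, int]:
    """Assign each node the smallest color not used by its colored neighbors."""
    adjacency: Dict[str, Set[str]] = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    colors: Dict[str, int] = {}
    # Visit high-degree nodes first, a common heuristic; ties break by name.
    for node in sorted(nodes, key=lambda n: (-len(adjacency[n]), n)):
        used = {colors[m] for m in adjacency[node] if m in colors}
        color = 0
        while color in used:
            color += 1
        colors[node] = color
    return colors


# Hypothetical example: each edge marks two registers read by one instruction.
registers = ["r1", "r2", "r3", "r4"]
co_read = [("r1", "r2"), ("r2", "r3"), ("r3", "r4")]
alignment = greedy_coloring(registers, co_read)
print(alignment)
```

Greedy coloring does not guarantee the minimum number of colors in general; a production allocator would also have to handle the case where more colors are needed than slots exist.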
The benefits of their invention are:
Technique | IPC | Register reads | RF Dynamic Energy | RF Size |
Register packing | 1 | 1 | 1 | 0.65 |
Register packing + Virtualization | 1 | 1 | 1 | 0.43 |
CORF | 1.04 | 0.9 | 0.92 | 0.43 |
CORF++ | 1.09 | 0.77 | 0.83 | 0.43 |
The table above compares CORF and CORF++ with register packing, alone and combined with register virtualization. All values are normalized to the baseline GPU register file.
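Because the values are normalized to a baseline of 1, each one converts directly to a percentage change. The short sketch below copies the numbers from the table and performs that arithmetic; the helper name `percent_change` is illustrative only.

```python
# Normalized values copied from the table (baseline GPU register file = 1.0).
results = {
    "Register packing": {
        "IPC": 1.0, "Register reads": 1.0, "RF Dynamic Energy": 1.0, "RF Size": 0.65,
    },
    "Register packing + Virtualization": {
        "IPC": 1.0, "Register reads": 1.0, "RF Dynamic Energy": 1.0, "RF Size": 0.43,
    },
    "CORF": {
        "IPC": 1.04, "Register reads": 0.9, "RF Dynamic Energy": 0.92, "RF Size": 0.43,
    },
    "CORF++": {
        "IPC": 1.09, "Register reads": 0.77, "RF Dynamic Energy": 0.83, "RF Size": 0.43,
    },
}


def percent_change(normalized: float) -> float:
    """Convert a baseline-normalized value to a percentage change."""
    return round((normalized - 1.0) * 100.0, 1)


changes = {
    technique: {metric: percent_change(value) for metric, value in metrics.items()}
    for technique, metrics in results.items()
}

for technique, metrics in changes.items():
    print(technique, metrics)
```

Read this way, the CORF++ row corresponds to 9% higher IPC, 23% fewer register reads, 17% lower RF dynamic energy, and a 57% smaller RF relative to the baseline.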
The design is fully prototyped in an architectural simulator (GPGPU-Sim). Some elements (e.g., hardware designs) have been further developed to evaluate complexity and energy efficiency.
Computer systems organization, Architectures, Software, Compilers, Graphical Processing Units, GPU, Microarchitecture, Register file