CORF: Coalescing Operand Register File for Graphics Processing Units

Tech ID: 32692 / UC Case 2019-123-0

Patent Status

Patent Pending

Background

Modern Graphics Processing Units (GPUs) consist of several Streaming Multiprocessors (SMs), each with its own Register File (RF) and a number of integer, floating-point, and specialized computational cores. A GPU program is decomposed into one or more cooperative thread arrays that are scheduled onto the SMs. GPUs invest in large RFs to enable fast, fine-grained switching between executing groups of threads. As a result, RFs are the most power-hungry components of the GPU, and the RF organization substantially affects the GPU's overall performance and energy efficiency.

Innovation

Prof. Nael Abu-Ghazaleh and his research team have designed CORF, a novel, patent-pending architecture for register coalescing that improves performance and energy efficiency. Register coalescing combines multiple register reads into a single physical register read. The design exploits coalescing opportunities through a combination of compiler-guided register allocation and a coalescing-aware register organization. To maximize operand coalescing opportunities, an extended design, called CORF++, combines compiler-assisted register allocation with a reorganized RF.
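As a rough illustration of what coalescing means for read counts, the toy model below counts physical reads for one instruction. It is only a sketch under stated assumptions: the `placement` mapping, the register names, and the idea that several architectural registers share one physical register entry are hypothetical and are not taken from the CORF design itself.

```python
from typing import Dict, List


def physical_reads(sources: List[str], placement: Dict[str, int]) -> int:
    """Count physical register reads needed for one instruction.

    Source operands placed in the same physical register are served
    by a single read, so only distinct physical registers are counted.
    """
    return len({placement[reg] for reg in sources})


# Hypothetical placements (architectural register -> physical register).
separate = {"r0": 0, "r1": 1, "r2": 2}  # every operand in its own entry
packed = {"r0": 0, "r1": 0, "r2": 1}    # r0 and r1 share physical entry 0

sources = ["r0", "r1", "r2"]
uncoalesced = physical_reads(sources, separate)  # three distinct entries
coalesced = physical_reads(sources, packed)      # two distinct entries
```

In this toy setting, the same three-operand instruction needs one fewer physical read once two of its operands share an entry, which is the effect the compiler-guided allocation aims to make common.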

CORF++ Overview

At compile time, registers are aligned through a graph coloring algorithm to maximize coalescing opportunities.
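The listing does not spell out the coloring formulation, so the following is a generic greedy graph coloring sketch rather than the CORF++ compiler pass. It assumes, purely for illustration, that registers read together by one instruction are joined by an edge and that a "color" stands for an alignment slot; `build_graph`, `greedy_color`, and the sample instructions are hypothetical names and data.

```python
from typing import Dict, List, Set, Tuple


def build_graph(instructions: List[Tuple[str, ...]]) -> Dict[str, Set[str]]:
    """Connect every pair of registers read together by one instruction."""
    graph: Dict[str, Set[str]] = {}
    for srcs in instructions:
        for reg in srcs:
            graph.setdefault(reg, set())
        for i, a in enumerate(srcs):
            for b in srcs[i + 1:]:
                if a != b:
                    graph[a].add(b)
                    graph[b].add(a)
    return graph


def greedy_color(graph: Dict[str, Set[str]], num_colors: int) -> Dict[str, int]:
    """Greedily assign one of num_colors slots to each register.

    Nodes are visited from highest to lowest degree. If every slot is
    already used by a neighbour, the slot with the fewest conflicting
    neighbours is chosen, so the result may contain conflicts.
    """
    colors: Dict[str, int] = {}
    for node in sorted(graph, key=lambda n: (-len(graph[n]), n)):
        conflicts = [
            sum(1 for m in graph[node] if colors.get(m) == c)
            for c in range(num_colors)
        ]
        colors[node] = conflicts.index(min(conflicts))
    return colors


# Hypothetical source-operand pairs of three instructions.
instructions = [("r0", "r1"), ("r1", "r2"), ("r2", "r3")]
alignment = greedy_color(build_graph(instructions), num_colors=2)
```

With two slots, registers that are read together in this example end up in different slots, which is the property a coalescing-aware layout would need in order to serve both operands from one access under the assumptions above.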

Advantages

The benefits of their invention are:

  • Allows multiple operands to be read in a single cycle, overcoming port serialization.
  • Reduces pressure on the RF, potentially reducing register bank conflicts.
  • Combined savings of 17% in dynamic energy and 52% in leakage energy, a 23% reduction in the number of register reads, and a 9% improvement in instructions per cycle (IPC).

Technique                            IPC    Register reads    RF Dynamic Energy    RF Size
Register packing                     1      1                 1                    0.65
Register packing + Virtualization    1      1                 1                    0.43
CORF                                 1.04   0.9               0.92                 0.43
CORF++                               1.09   0.77              0.83                 0.43

The table above compares CORF and CORF++ with register packing, with and without register virtualization. All values are normalized to the baseline GPU register file.
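Because the table values are normalized to the baseline, each one converts to a percentage change as (value - 1) x 100. The short sketch below applies that arithmetic to the CORF and CORF++ rows; the function name and dictionary layout are illustrative choices, and the numbers are copied from the table.

```python
def percent_change(normalized: float) -> int:
    """Convert a baseline-normalized value to a rounded percentage change."""
    return round((normalized - 1.0) * 100)


# Rows copied from the table (all values relative to the baseline RF).
table = {
    "CORF": {"ipc": 1.04, "reads": 0.9, "dynamic_energy": 0.92, "rf_size": 0.43},
    "CORF++": {"ipc": 1.09, "reads": 0.77, "dynamic_energy": 0.83, "rf_size": 0.43},
}

changes = {
    technique: {metric: percent_change(value) for metric, value in row.items()}
    for technique, row in table.items()
}

for technique, row in changes.items():
    print(technique, row)
```

For CORF++ this gives a 9% IPC gain, 23% fewer register reads, and 17% lower RF dynamic energy, matching the figures quoted under Advantages. The leakage figure is not derived here because the table reports RF size rather than leakage energy.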

State Of Development

The design is fully prototyped in an architectural simulator (GPGPU-Sim). Some elements (e.g., hardware designs) have been further developed to evaluate complexity and energy efficiency.

Other Information

Keywords

Computer systems organization, Architectures, Software, Compilers, Graphical Processing Units, GPU, Microarchitecture, Register file