On the Efficiency of Register File versus Broadcast Interconnect for Collective Communications in Data-Parallel Hardware Accelerators

Ardavan Pedram; Andreas Gerstlauer; Robert A. van de Geijn

Ardavan Pedram University of Texas
Andreas Gerstlauer University of Texas
Robert A. van de Geijn University of Texas

Abstract

Reducing power consumption and increasing efficiency are key concerns for many applications. A fundamental question is how to design highly efficient computing elements while maintaining enough flexibility within a domain of applications. In this paper, we show how broadcast buses can eliminate the need for power-hungry multi-ported register files in data-parallel hardware accelerators for linear algebra operations. We demonstrate an algorithm/architecture co-design for mapping different collective communication operations, which are crucial for achieving performance and efficiency in most linear algebra routines, such as GEMM, SYRK, and matrix transposition. We compare a broadcast-bus-based architecture with conventional SIMD, 2D-SIMD, and flat register file architectures for these operations in terms of area and energy efficiency. Results show that fast broadcast data movement in a prototypical linear algebra core can achieve up to 75× better power efficiency and up to 10× better area efficiency than traditional SIMD architectures.

Keywords: Registers, Arrays, Symmetric matrices, Vectors, Hardware, Register file, Broadcast bus, Matrix multiply, Power efficiency, High performance computing

IEEE Xplore (English)

Published

24 October 2012

How to Cite

PEDRAM, Ardavan; GERSTLAUER, Andreas; GEIJN, Robert A. van de. On the Efficiency of Register File versus Broadcast Interconnect for Collective Communications in Data-Parallel Hardware Accelerators. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 24., 2012, New York/USA. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2012. p. 19-26.