Exploiting Limited Access Distance for Kernel Fusion Across the Stages of Explicit One-Step Methods on GPUs

  • Matthias Korch University of Bayreuth
  • Tim Werner University of Bayreuth

Resumo


The performance of explicit parallel methods solving large systems of ordinary differential equations (ODEs) on GPUs is often memory bound. Therefore, locality optimizations, such as kernel fusion, are desirable. This paper exploits a special property of a large class of right-hand-side (RHS) functions to enable the fusion of computations of blocks of components across multiple stages of the method. This leads to a tiling of the stages within one time step. Our approach is based on a representation of the ODE method by a data flow graph and allows efficient GPU code with fused kernels to be generated automatically for user-defined tilings. In particular, we investigate two generalized tiling strategies, trapezoidal and hexagonal tiling, which are evaluated experimentally for several different high-order Runge-Kutta (RK) methods.
Palavras-chave: Kernel, Graphics processing units, Optimization, Computer architecture, Indexes, Registers, Instruction sets, GPU, GPGPU, ODE, one-step, explicit, Runge-Kutta, limited access distance, kernel fusion, hexagonal tiling, trapezoidal tiling
Publicado
24/09/2018
KORCH, Matthias; WERNER, Tim. Exploiting Limited Access Distance for Kernel Fusion Across the Stages of Explicit One-Step Methods on GPUs. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 30. , 2018, Lyon/FR. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2018 . p. 148-157.