Avoiding Synchronization to Accelerate a CFD Solver in GPU

Ernesto Dufrechou; Pablo Ezzatti; Gabriel Usera

Ernesto Dufrechou, Universidad de la República
Pablo Ezzatti, Universidad de la República
Gabriel Usera, Universidad de la República

Abstract

caffa3d.MBRi is an open-source, GPU-aware, general-purpose incompressible flow solver. It aims to provide a useful tool for the numerical simulation of real-world fluid flow problems that require both geometrical flexibility and parallel computation capabilities, in order to handle simulations with tens to hundreds of millions of cells. At the core of this tool are several linear solvers that can be selected according to the characteristics of the problem being solved. For banded matrices, the most efficient linear solver included in caffa3d.MBRi is the Strongly Implicit Procedure (SIP) solver. The parallelization of this solver follows the hyper-planes strategy: the computations within one hyper-plane bear no mutual dependencies and can be executed in parallel, while the hyper-planes themselves have to be processed sequentially. In this work, we analyze this strategy in order to reach an efficient GPU implementation of the SIP solver for caffa3d.MBRi. In particular, we design and implement a self-scheduling procedure that avoids the CPU-GPU synchronization overhead implied by the hyper-planes strategy, outperforming the standard GPU implementation of the SIP by approximately 2×.
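As an illustration of the hyper-planes strategy only, the following minimal Python sketch assumes a structured nx × ny × nz grid with a 7-point stencil, so that the forward sweep at cell (i, j, k) depends on the cells (i-1, j, k), (i, j-1, k) and (i, j, k-1). Under that assumption, all cells with the same value of i + j + k form one hyper-plane and can be updated independently. The function names and the toy update rule (unit coefficients) are hypothetical and are not taken from caffa3d.MBRi; the self-scheduling procedure itself is not modeled here.

```python
from collections import defaultdict


def hyperplanes(nx, ny, nz):
    """Group the cells of an nx*ny*nz grid by hyper-plane index h = i + j + k."""
    planes = defaultdict(list)
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                planes[i + j + k].append((i, j, k))
    # h runs from 0 to (nx-1)+(ny-1)+(nz-1), i.e. nx+ny+nz-2 hyper-planes.
    return [planes[h] for h in range(nx + ny + nz - 2)]


def forward_sweep(nx, ny, nz, rhs):
    """Toy forward substitution (assumed unit coefficients, not real SIP factors)."""
    x = {}
    for plane in hyperplanes(nx, ny, nz):   # hyper-planes: strictly sequential
        for (i, j, k) in plane:             # cells in a plane: mutually independent
            # Every neighbour read here lies in the previous hyper-plane.
            x[i, j, k] = (rhs[i, j, k]
                          + x.get((i - 1, j, k), 0.0)
                          + x.get((i, j - 1, k), 0.0)
                          + x.get((i, j, k - 1), 0.0))
    return x
```

In this sketch the inner loop is the part that could run in parallel, while the outer loop is the sequential chain of hyper-planes. That sequential chain is where the synchronization cost discussed in the abstract would arise.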

Keywords: Graphics processing units, Kernel, Instruction sets, Synchronization, Computational modeling, Random access memory, Indexes, Graphics processors, Strongly Implicit Procedure, Computational fluid dynamics, Asynchronous computations