Overcoming Memory-Capacity Constraints in the Use of ILUPACK on Graphics Processors

José I. Aliaga; Ernesto Dufrechou; Pablo Ezzatti; Enrique S. Quintana-Ortí

José I. Aliaga, Universidad Jaime I
Ernesto Dufrechou, Universidad de la República
Pablo Ezzatti, Universidad de la República
Enrique S. Quintana-Ortí, Universidad de la República

Abstract

A large number of scientific and engineering problems require the solution of large, sparse linear systems of equations. In previous work, we applied a GPU accelerator to the solution of sparse linear systems of moderate dimension via ILUPACK, obtaining significant reductions in execution time while maintaining the quality of the solution. Unfortunately, relying on GPUs attached to a single compute node strongly limits the memory available to solve the systems, and thus the size of the problems that can be tackled with this approach. In this work we introduce a distributed-parallel version of ILUPACK that overcomes these limitations. The evaluation shows that using multiple GPUs, located on distinct nodes of a cluster, yields significant reductions in execution time for large problems and, more importantly, makes it possible to increase the dimension of the problems that can be solved, with interesting scaling properties.

Keywords: Linear systems, Integrated circuits, Graphics processing units, Mathematical model, Parallel processing, Computer architecture, Sparse linear systems, Distributed memory platforms, Conjugate gradient (CG) method, Incomplete LU factorization, Graphics processors (GPUs)