Mixed and Multi-Precision SpMV for GPUs with Row-wise Precision Selection
Abstract
Sparse matrix-vector multiplication (SpMV) is a key memory-bound kernel widely used in industrial and scientific applications. To reduce its data movement and benefit from higher compute rates, several efforts have applied mixed precision to SpMV. Most prior work focuses on performing the entire SpMV in single precision within the larger context of an iterative solver (e.g., CG, GMRES). In this work, we are interested in a more fine-grained mixed-precision SpMV, where the level of precision is decided for each element in the matrix to be used in a single operation. We extend an existing entry-wise precision-based approach by deciding precisions per row, motivated by the granularity of parallelism on a GPU, where groups of threads process rows in CSR-based matrices. We propose mixed-precision CSR storage methods with row permutations and describe their greater efficiency and better load balancing compared to the existing method. We also consider a multi-precision case in which single- and double-precision copies of the matrix are stored in advance, and we further extend our mixed-precision SpMV approach to support it. We then leverage mixed-precision SpMV to obtain a multi-precision Jacobi method that is faster than, yet almost as accurate as, a double-precision Jacobi implementation, and we further evaluate a multi-precision cardiac modeling algorithm. We demonstrate the effectiveness of the proposed SpMV methods on an extensive dataset of large real-valued sparse matrices from the SuiteSparse Matrix Collection using an NVIDIA V100 GPU.
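To make the idea of row-wise precision selection with row permutation concrete, the following is a minimal CPU sketch in Python with NumPy, not the GPU implementation described in the paper. The selection rule (a row is treated as single precision when all of its magnitudes fall below a threshold), the function names `split_rows_by_precision` and `mixed_spmv`, and the `threshold` parameter are all assumptions introduced for illustration; the abstract does not specify the actual criterion or storage layout.

```python
import numpy as np

def split_rows_by_precision(indptr, data, threshold=1.0):
    """Classify CSR rows and build a permutation grouping them by precision.

    Hypothetical rule (an assumption, not the paper's criterion): a row is
    'single' if every stored value has magnitude below `threshold`.
    """
    n_rows = len(indptr) - 1
    is_single = np.array(
        [bool(np.all(np.abs(data[indptr[i]:indptr[i + 1]]) < threshold))
         for i in range(n_rows)],
        dtype=bool,
    )
    # Single-precision rows first, double-precision rows after them.
    perm = np.concatenate([np.flatnonzero(is_single), np.flatnonzero(~is_single)])
    return perm, int(is_single.sum())

def mixed_spmv(indptr, indices, data, x, perm, n_single):
    """Compute y = A @ x, using float32 arithmetic for the first n_single
    rows of the permutation and float64 arithmetic for the rest."""
    y = np.zeros(len(indptr) - 1, dtype=np.float64)
    x32 = x.astype(np.float32)
    for k, row in enumerate(perm):
        lo, hi = indptr[row], indptr[row + 1]
        cols = indices[lo:hi]
        if k < n_single:
            # Values and vector entries are rounded to float32 for this row.
            y[row] = np.dot(data[lo:hi].astype(np.float32), x32[cols])
        else:
            y[row] = np.dot(data[lo:hi], x[cols])
    return y

# Small 3x3 example in CSR form (values chosen to be exact in float32).
indptr = np.array([0, 2, 3, 5])
indices = np.array([0, 2, 1, 0, 2])
data = np.array([0.5, 0.25, 3.0, 0.125, 0.5])
x = np.array([1.0, 2.0, 4.0])

perm, n_single = split_rows_by_precision(indptr, data, threshold=1.0)
y = mixed_spmv(indptr, indices, data, x, perm, n_single)
```

Grouping rows of the same precision through the permutation mirrors the motivation stated above: on a GPU, thread groups that process neighbouring rows would then execute the same precision path. In this sketch the permutation only determines processing order, and results are written back to the original row positions.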
Keywords:
spmv, mixed-precision, gpu, cuda
Published
2 November 2022
How to Cite
TEZCAN, Erhan; TORUN, Tugba; KOŞAR, Fahrican; KAYA, Kamer; UNAT, Didem. Mixed and Multi-Precision SpMV for GPUs with Row-wise Precision Selection. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 34., 2022, Bordeaux/France. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 31-40.