Multicore Performance Engineering of Sparse Triangular Solves Using a Modified Roofline Model
Abstract
The Roofline model is widely used to visualize the performance of executed code together with the upper performance bounds given by the memory bandwidth and the processor peak performance. It can thus provide an insightful visualization of bottlenecks. In this paper, we aim to establish realistic bandwidth ceilings for the sparse triangular solve step of PARDISO, a leading sparse direct solver package that is also part of Intel MKL. The performance of the forward and backward substitution process is analyzed and benchmarked for a representative set of sparse matrices on seven modern x86-type multicore architectures and the Knights Landing manycore architecture. We show how to measure the necessary quantities accurately, including for threaded code, and discuss the measurement approach, its validation, and its limitations. Our modeling approach covers the serial and parallel execution phases, allowing for in-socket performance predictions.
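As a rough illustration of the bound the abstract refers to, the following sketch computes the classic Roofline limit, the minimum of peak performance and the product of computational intensity and memory bandwidth. It shows the textbook form only, not the modified model developed in the paper; the function name and the machine numbers are hypothetical and chosen purely for the example.

```python
def roofline_bound(peak_gflops, bandwidth_gbs, intensity_flop_per_byte):
    """Classic Roofline upper bound in GFlop/s (not the paper's modified model)."""
    return min(peak_gflops, intensity_flop_per_byte * bandwidth_gbs)

# Hypothetical machine: 100 GFlop/s peak, 50 GB/s memory bandwidth.
low = roofline_bound(100.0, 50.0, 0.25)   # low intensity: bandwidth-limited
high = roofline_bound(100.0, 50.0, 4.0)   # high intensity: capped by peak
print(low, high)
```

Sparse triangular solves typically have low computational intensity, so under these assumptions they fall on the bandwidth-limited side of the bound, which is why the choice of a realistic bandwidth ceiling matters.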
Keywords:
Sparse matrices, Particle separators, Analytical models, Computational modeling, Computer architecture, Mathematical model, Hardware
Published
24/09/2018
How to Cite
WITTMANN, Markus; HAGER, Georg; JANALIK, Radim; LANSER, Martin; KLAWONN, Axel; RHEINBACH, Oliver; SCHENK, Olaf; WELLEIN, Gerhard. Multicore Performance Engineering of Sparse Triangular Solves Using a Modified Roofline Model. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 30., 2018, Lyon/FR. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2018. p. 233-241.
