NUMA-Aware Dense Matrix Factorizations and Inversion with Look-Ahead on Multicore Processors

  • Sandra Catalán Universidad Complutense de Madrid
  • Francisco D. Igual Universidad Complutense de Madrid
  • Rafael Rodríguez-Sánchez Universidad Complutense de Madrid
  • José R. Herrero Universidad Complutense de Madrid
  • Enrique S. Quintana-Ortí Universitat Politécnica de Catalunya

Abstract


We address the efficient design and implementation of dense matrix factorizations and inversion (DMFI) on modern multicore processors with several NUMA (non-uniform memory access) nodes. Our approach enhances the DMFI routines with a look-ahead strategy to overcome the “panel factorization bottleneck”. In addition, it combines task- and loop-level parallelization in a hybrid scheme while taking into account the NUMA organization of the memory hierarchy. Experiments on a Huawei Kunpeng-based server with two sockets and 48 cores per socket, covering three representative dense linear algebra operations, show that both the codes and their execution environment parameters must be adapted to improve data access locality. With these changes, performance across inter- and intra-socket NUMA configurations is superior to that of reference implementations from state-of-the-art libraries for this platform.
Keywords: Multicore processors, NUMA, dense linear algebra, look-ahead, multi-threaded parallelism, high performance
Published
02/11/2022
CATALÁN, Sandra; IGUAL, Francisco D.; RODRÍGUEZ-SÁNCHEZ, Rafael; HERRERO, José R.; QUINTANA-ORTÍ, Enrique S. NUMA-Aware Dense Matrix Factorizations and Inversion with Look-Ahead on Multicore Processors. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 34., 2022, Bordeaux/France. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 91-99.