NUMA-Aware Dense Matrix Factorizations and Inversion with Look-Ahead on Multicore Processors

  • Sandra Catalán, Universidad Complutense de Madrid
  • Francisco D. Igual, Universidad Complutense de Madrid
  • Rafael Rodríguez-Sánchez, Universidad Complutense de Madrid
  • José R. Herrero, Universidad Complutense de Madrid
  • Enrique S. Quintana-Ortí, Universitat Politécnica de Catalunya

Abstract

We address the efficient design and implementation of dense matrix factorizations and inversion (DMFI) on modern multicore processors with several NUMA (non-uniform memory access) nodes. Our approach enhances the DMFI routines with a look-ahead strategy to overcome the “panel factorization bottleneck”. In addition, it combines task-level and loop-level parallelism in a hybrid scheme while taking into account the NUMA organization of the memory hierarchy. Experiments on a Huawei Kunpeng-based server with two sockets and 48 cores per socket, covering three representative dense linear algebra operations, show that both the codes and their execution environment parameters must be adapted to improve data access locality. With these changes, performance across inter- and intra-socket NUMA configurations is superior to that of reference implementations from state-of-the-art libraries for this platform.
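As a rough illustration of the look-ahead idea mentioned in the abstract, the following sketch reorders a blocked LU factorization so that the next panel is updated and factored while the rest of the trailing matrix is still being updated. It is not the authors' implementation: the choice of LU without pivoting, the function names (`factor_panel`, `update_columns`, `lu_lookahead`), the block size parameter `b`, and the two-thread split are all assumptions made for illustration. NUMA-aware data placement is not shown, and Python threads will not deliver a real speedup here; the point is only the dependency structure.

```python
from concurrent.futures import ThreadPoolExecutor


def factor_panel(A, k, b):
    """Unblocked LU (no pivoting) of the panel made of columns k..k+b-1, in place."""
    n = len(A)
    ke = min(k + b, n)
    for j in range(k, ke):
        for i in range(j + 1, n):
            A[i][j] /= A[j][j]              # multiplier L[i][j]
            for c in range(j + 1, ke):      # update only inside the panel
                A[i][c] -= A[i][j] * A[j][c]


def update_columns(A, k, b, c0, c1):
    """Apply the transformations of panel k to columns c0..c1-1."""
    n = len(A)
    ke = min(k + b, n)
    for j in range(k, ke):
        for i in range(j + 1, n):
            lij = A[i][j]
            for c in range(c0, c1):
                A[i][c] -= lij * A[j][c]


def lu_lookahead(A, b):
    """Blocked LU of a square list-of-lists matrix with one level of look-ahead.

    Assumes A needs no pivoting (for example, it is diagonally dominant).
    Returns A overwritten with L (unit lower part) and U (upper part).
    """
    n = len(A)
    factor_panel(A, 0, b)                   # first panel has nothing to overlap with
    with ThreadPoolExecutor(max_workers=2) as pool:
        for k in range(0, n, b):
            nxt = k + b
            if nxt >= n:
                break
            nxt_end = min(nxt + b, n)

            def panel_task(k=k, nxt=nxt, nxt_end=nxt_end):
                # Critical path: bring the next panel up to date, then factor it.
                update_columns(A, k, b, nxt, nxt_end)
                factor_panel(A, nxt, b)

            def trailing_task(k=k, nxt_end=nxt_end):
                # Bulk work: update the remaining trailing columns.
                update_columns(A, k, b, nxt_end, n)

            # The two tasks write to disjoint column ranges, so they can overlap.
            futures = [pool.submit(panel_task), pool.submit(trailing_task)]
            for f in futures:
                f.result()
    return A
```

In this arrangement the panel factorization of step k+1 no longer waits for the whole trailing update of step k, which is the bottleneck that look-ahead targets. In a tuned code the trailing update would itself be parallelized across many cores, with threads and data mapped to NUMA nodes.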
Published
2022-11-02
How to Cite
CATALÁN, Sandra et al. NUMA-Aware Dense Matrix Factorizations and Inversion with Look-Ahead on Multicore Processors. Proceedings of the International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), [S.l.], p. 91-99, Nov. 2022. ISSN 0000-0000. Available at: <https://sol.sbc.org.br/index.php/sbac-pad/article/view/28236>. Accessed: 17 May 2024.