Dynasor: A Dynamic Memory Layout for Accelerating Sparse MTTKRP for Tensor Decomposition on Multi-core CPU

  • Sasindu Wijeratne University of Southern California
  • Rajgopal Kannan DEVCOM Army Research Lab
  • Viktor Prasanna University of Southern California

Resumo


Sparse Matricized Tensor Times Khatri-Rao Prod-uct (spMTTKRP) is the most time-consuming compute kernel in sparse tensor decomposition. In this paper, we introduce a novel algorithm to minimize the execution time of spMTTKRP across all modes of an input tensor on multi-core CPU plat-form. The proposed algorithm leverages the FLYCOO tensor format to exploit data locality in external memory accesses. It effectively utilizes computational resources by enabling lock-free concurrent processing of independent partitions of the input tensor. The proposed partitioning ensures load balancing among CPU threads. Our dynamic tensor remapping technique leads to reduced communication overhead along all the modes. On widely used real-world tensors, our work achieves 2.12x - 9.01x speedup in total execution time across all modes compared with the state-of-the-art CPU implementations.
Palavras-chave: Tensor Decomposition, spMTTKRP, CPU
Publicado
17/10/2023
Como Citar

Selecione um Formato
WIJERATNE, Sasindu; KANNAN, Rajgopal; PRASANNA, Viktor. Dynasor: A Dynamic Memory Layout for Accelerating Sparse MTTKRP for Tensor Decomposition on Multi-core CPU. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 35. , 2023, Porto Alegre/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023 . p. 23-33.