Performance Improvements of Parallel Applications thanks to MPI-4.0 Hints

  • Maxim Moraru, LICIIS / Université de Reims Champagne Ardenne
  • Adrien Roussel, CEA / DAM / DIF / LRC DIGIT / Université Paris-Saclay
  • Hugo Taboada, CEA / DAM / DIF / LRC DIGIT / Université Paris-Saclay
  • Christophe Jaillet, LICIIS / LRC DIGIT / Université de Reims Champagne Ardenne
  • Marc Pérache, CEA / DAM / DIF / LRC DIGIT / Université Paris-Saclay
  • Michael Krajecki, LICIIS / LRC DIGIT / Université de Reims Champagne Ardenne

Abstract

HPC systems have grown significantly in recent years, and modern machines now have hundreds of thousands of nodes. The Message Passing Interface (MPI) is the de facto standard for distributed computing on these architectures. On the MPI critical path, message matching is one of the most time-consuming operations: searching a message queue for a specific request accounts for a significant part of the communication latency. So far, no single algorithm performs well in all cases. This paper explores matching specializations enabled by the hints introduced in the latest MPI 4.0 standard. We propose a hash-table-based algorithm that performs constant-time message matching for requests without wildcards. This approach suits the intensive point-to-point communication phases found in many applications (more than 50% of the CORAL benchmarks). We demonstrate that it can improve the overall execution time of real HPC applications by up to 25%. We also analyze the limitations of our method and propose a strategy for identifying the most suitable algorithm for a given application: we apply machine learning techniques to classify applications according to the characteristics of their message patterns.
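To make the idea of constant-time matching concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the application has promised never to use wildcard receives (MPI 4.0 defines info keys such as `mpi_assert_no_any_source` and `mpi_assert_no_any_tag` for this purpose; which hints the paper relies on is an assumption here), so every receive names an exact source and tag. The class `HashMatcher` and its method names are hypothetical, and one matcher per communicator is assumed.

```python
from collections import defaultdict, deque


class HashMatcher:
    """Toy message matcher for receives that never use wildcards (hypothetical)."""

    def __init__(self):
        # One FIFO per (source, tag) key, so messages with the same key
        # are matched in the order they were posted or received.
        self.posted = defaultdict(deque)      # receives waiting for a message
        self.unexpected = defaultdict(deque)  # messages waiting for a receive

    def post_recv(self, source, tag, request):
        """Post a receive; return an already-arrived payload, or None."""
        key = (source, tag)
        queue = self.unexpected.get(key)
        if queue:
            return queue.popleft()
        self.posted[key].append(request)
        return None

    def on_message(self, source, tag, payload):
        """Handle an incoming message; return the matching request, or None."""
        key = (source, tag)
        queue = self.posted.get(key)
        if queue:
            return queue.popleft()
        self.unexpected[key].append(payload)
        return None
```

Because the key is fully known on both sides, each operation is a single dictionary lookup plus a queue push or pop, which takes constant time on average regardless of how many requests are pending. A wildcard receive would break this scheme, since its key could not be computed in advance; this is why the no-wildcard guarantee matters.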
Published
2022-11-02
How to Cite
MORARU, Maxim et al. Performance Improvements of Parallel Applications thanks to MPI-4.0 Hints. Proceedings of the International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), [S.l.], p. 273-282, Nov. 2022. ISSN 0000-0000. Available at: <https://sol.sbc.org.br/index.php/sbac-pad/article/view/28254>. Accessed: 17 May 2024.