A Source-to-Source NUMA Profiling Approach

  • Letícia S. F. Machado UFSCar
  • Claude Tadonki CRI / ParisTech-PSL
  • Hermes Senger UFSCar


The design of HPC processors is driven by the goal of integrating an increasing number of CPU cores. This trend in multicore design runs up against the physical limits of integrating circuits on a single die, in addition to the bottleneck of shared components, hence the advent of Non-Uniform Memory Access (NUMA) with its typical packaging. Cutting-edge supercomputers are made up of such (manycore) compute nodes. In any case, the main issue is scalability. With a NUMA configuration, a memory access can be local (within the same NUMA node) or remote (from one NUMA node to another). The latter is the main concern with respect to efficiency because its associated overhead is much higher. Dealing with this concern explicitly when designing a program is called NUMA-aware implementation. With an existing code, the problem can be addressed by starting with appropriate profiling. This is the focus of the present work, where we suggest a way to instrument the native code in order to get the type (i.e., local or remote) of each memory access, and we provide a tool that supports the profiling process. We then propose a metric that takes these statistics about memory accesses and provides a value indicating the potential associated overhead.
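As an illustration of the local/remote distinction described above, the following minimal Python sketch classifies recorded accesses and aggregates them into a single indicator. It is not the instrumentation or the metric proposed in this work: the names `Access`, `classify`, `access_counts`, and `overhead_indicator`, as well as the `remote_cost` weight, are hypothetical and chosen only for this example. The sketch assumes that each access record already carries the NUMA node of the issuing thread and the NUMA node holding the accessed page; on Linux these could plausibly be obtained through libnuma (for instance `numa_node_of_cpu` and `move_pages`), but that step is left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One recorded memory access (hypothetical record layout)."""
    thread_node: int  # NUMA node of the CPU that issued the access
    page_node: int    # NUMA node that holds the accessed page


def classify(access: Access) -> str:
    """Return 'local' if both nodes match, 'remote' otherwise."""
    return "local" if access.thread_node == access.page_node else "remote"


def access_counts(accesses) -> dict:
    """Count local and remote accesses in a trace."""
    counts = {"local": 0, "remote": 0}
    for access in accesses:
        counts[classify(access)] += 1
    return counts


def overhead_indicator(counts: dict, remote_cost: float = 2.0) -> float:
    """Illustrative indicator, NOT the metric of the paper.

    Assumes a local access costs 1 unit and a remote access costs
    `remote_cost` units (an assumed placeholder weight), and returns the
    average extra cost per access relative to an all-local execution.
    """
    total = counts["local"] + counts["remote"]
    if total == 0:
        return 0.0
    weighted = counts["local"] + remote_cost * counts["remote"]
    return weighted / total - 1.0


# Usage: three local accesses and one remote access.
trace = [Access(0, 0), Access(0, 0), Access(1, 1), Access(0, 1)]
counts = access_counts(trace)
print(counts)                      # {'local': 3, 'remote': 1}
print(overhead_indicator(counts))  # 0.25
```

With this assumed cost model, the indicator is 0 when every access is local and grows linearly with the fraction of remote accesses; a measured remote-to-local latency ratio for the target machine would be a more realistic choice for `remote_cost`.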
MACHADO, Letícia S. F.; TADONKI, Claude; SENGER, Hermes. A Source-to-Source NUMA Profiling Approach. In: WORKSHOP ON APPLICATIONS FOR MULTI-CORE ARCHITECTURES - INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 35., 2023, Porto Alegre/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 54-59.