A tool for profiling memory access locality on NUMA architectures
Abstract. In this paper, we study the NUMA architecture and the memory access patterns of applications running on it. We developed a tool that parses the source code of a target application and produces traces recording which memories of the NUMA nodes were accessed and in what order. From these traces, we expect to gain insights into strategies for optimizing the performance of target applications. We will apply the tool to improve the performance of a wave equation simulator that is the kernel of a seismic imaging application for the oil and gas industry.
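To illustrate how such a trace might be used, the minimal sketch below summarizes a list of accesses into local and remote counts. The record format (`Access` with `cpu_node` and `mem_node` fields) and the function `locality_summary` are hypothetical names chosen for this example; they are assumptions and do not describe the tool's actual output format.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Access(NamedTuple):
    """One trace record (hypothetical format, not the tool's real output)."""
    cpu_node: int  # NUMA node of the core that issued the access
    mem_node: int  # NUMA node whose memory holds the accessed address


def locality_summary(trace: Iterable[Access]) -> dict:
    """Count local vs. remote accesses and the remote traffic per node pair."""
    local = 0
    remote = 0
    remote_pairs = Counter()
    for acc in trace:
        if acc.cpu_node == acc.mem_node:
            local += 1
        else:
            remote += 1
            # Key is (issuing node, node that owns the memory).
            remote_pairs[(acc.cpu_node, acc.mem_node)] += 1
    total = local + remote
    return {
        "local": local,
        "remote": remote,
        "local_ratio": local / total if total else 0.0,
        "remote_pairs": dict(remote_pairs),
    }


if __name__ == "__main__":
    # Tiny made-up trace, listed in access order.
    trace = [Access(0, 0), Access(0, 1), Access(1, 1), Access(0, 1)]
    print(locality_summary(trace))
```

A low `local_ratio`, or one node pair dominating `remote_pairs`, would point to data that might be better placed on the node that uses it most. This summary ignores access order; an analysis of access patterns over time would need to keep the sequence.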