AtTune: A Heuristic based Framework for Parallel Applications Autotuning

  • Hiago Rocha UFRGS
  • Janaina Schwarzrock UFRGS
  • Monica Pereira UFRN
  • Lucas Schnorr UFRGS
  • Philippe Navaux UFRGS
  • Arthur Lorenzon Unipampa
  • Antonio Carlos Schneider Beck Filho UFRGS

Abstract

Several factors limit the scalability of parallel applications, e.g., off-chip bus saturation and data synchronization. Moreover, the high cost of cooling HPC systems, which can outweigh the cost of developing the system itself, means that parallel applications must now meet requirements for both performance and energy. In this work, we propose AtTune: a heuristic-based framework that tunes the number of processes/threads and the CPU frequency to optimize the execution of parallel applications. AtTune is transparent to the user, independent of the input size, and supports different parallel programming models. We evaluated the proposed solution on five well-known kernels implemented in MPI and OpenMP. Experimental results on two real multi-core systems show that AtTune improves energy efficiency, performance, and Energy-Delay Product (EDP) by up to 36%, 11%, and 32%, respectively.

Keywords: Automatic Tuning, Transparent Optimization, Thread Throttling, DVFS, Thread-Level Parallelism, Energy Efficiency
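
To make the tuning problem concrete, the following is a minimal sketch of searching the (thread count, CPU frequency) space for the lowest Energy-Delay Product, where EDP is energy multiplied by execution time. The abstract does not describe AtTune's actual heuristic, so the greedy hill climbing used here is a generic stand-in, not the authors' method. The names `edp`, `hill_climb`, and `toy_measure`, as well as the synthetic cost model and all its constants, are assumptions made purely for illustration.

```python
from typing import Callable, Dict, List, Tuple


def edp(energy_joules: float, time_seconds: float) -> float:
    """Energy-Delay Product: energy multiplied by execution time (lower is better)."""
    return energy_joules * time_seconds


def hill_climb(
    measure: Callable[[int, float], Tuple[float, float]],
    thread_counts: List[int],
    frequencies: List[float],
) -> Tuple[Tuple[int, float], float]:
    """Greedy neighbour search over the (threads, frequency) grid.

    `measure(threads, freq)` is a hypothetical callback returning
    (energy in joules, time in seconds) for one run of the application.
    Returns the best configuration found and its EDP (a local optimum).
    """
    cache: Dict[Tuple[int, int], float] = {}

    def cost(i: int, j: int) -> float:
        # Each configuration is measured at most once.
        if (i, j) not in cache:
            energy, time = measure(thread_counts[i], frequencies[j])
            cache[(i, j)] = edp(energy, time)
        return cache[(i, j)]

    # Start from the common default: all threads at the highest frequency.
    i, j = len(thread_counts) - 1, len(frequencies) - 1
    while True:
        neighbours = [
            (i + di, j + dj)
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1))
            if 0 <= i + di < len(thread_counts) and 0 <= j + dj < len(frequencies)
        ]
        if not neighbours:
            return (thread_counts[i], frequencies[j]), cost(i, j)
        best = min(neighbours, key=lambda p: cost(*p))
        if cost(*best) >= cost(i, j):
            # No neighbour improves EDP, so stop here.
            return (thread_counts[i], frequencies[j]), cost(i, j)
        i, j = best


def toy_measure(threads: int, freq_ghz: float) -> Tuple[float, float]:
    """Synthetic model (made-up constants): contention grows with threads,
    dynamic power grows with the cube of the frequency."""
    time = 10.0 / (threads * freq_ghz) + 0.5 * threads
    power = 20.0 + 2.0 * threads * freq_ghz ** 3
    return power * time, time


if __name__ == "__main__":
    config, value = hill_climb(toy_measure, [1, 2, 4, 8], [1.0, 1.5, 2.0])
    print(f"threads={config[0]}, freq={config[1]} GHz, EDP={value:.1f}")
```

In a real setting, `measure` would run the application and read hardware energy counters; the search itself only needs the resulting energy and time values.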

Published
23/11/2020
ROCHA, Hiago; SCHWARZROCK, Janaina; PEREIRA, Monica; SCHNORR, Lucas; NAVAUX, Philippe; LORENZON, Arthur; BECK FILHO, Antonio Carlos Schneider. AtTune: A Heuristic based Framework for Parallel Applications Autotuning. In: TRABALHOS EM ANDAMENTO - SIMPÓSIO BRASILEIRO DE ENGENHARIA DE SISTEMAS COMPUTACIONAIS (SBESC), 10., 2020, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 151-156. ISSN 2763-9002. DOI: https://doi.org/10.5753/sbesc_estendido.2020.13105.