Energy Consumption Estimation in Parallel Applications: an Analysis in Real and Theoretical Models
Abstract
This paper presents a detailed analysis of the energy consumed by the CPU, cache memory, and main memory when running parallel applications on HPC systems. It also examines the correlation between energy consumption, speedup, and execution time. Experiments are conducted with the NAS Parallel Benchmarks using three measurement tools: 1) Intel PCM, 2) Linux Perf, and 3) HP CACTI. The results compare two approaches to obtaining energy consumption figures: one using PCM, and the other using Perf and CACTI. For DRAM, the two approaches differ on average by 47% for sequential applications and by 19% for parallel applications. The system results show that the lowest energy consumption occurs only when all physical cores are used, indicating that hyper-threading did not bring energy benefits to the system. Moreover, the cache memory results show that the cache miss rate (regardless of the level) increases with the number of threads. However, a parallel application has lower cache memory energy consumption than its sequential version.
References
J. Mair, Z. Huang, D. Eyers and Y. Chen, "Quantifying the Energy Efficiency Challenges of Achieving Exascale Computing", IEEE International Symposium on Cluster, Cloud and Grid Computing, pp. 943-950, 2015.
R. Gioiosa, D. Kerbyson, and A. Hoisie, "Evaluating performance and power efficiency of scientific applications on multi-threaded systems", International Workshop on Energy Efficient Supercomputing, pp. 11-20, 2014.
R. Ge, X. Feng, S. Song, H. C. Chang, D. Li and K. W. Cameron, "PowerPack: Energy Profiling and Analysis of High-Performance Systems and Applications", IEEE Transactions on Parallel and Distributed Systems, vol. 21, no. 5, pp. 658-671, 2010.
A. F. Lorenzon, A. L. Sartor, M. C. Cera and A. C. S. Beck, "Optimized Use of Parallel Programming Interfaces in Multithreaded Embedded Architectures", IEEE Computer Society Annual Symposium on VLSI, pp. 410-415, 2015.
M. A. Suleman, M. K. Qureshi and Y. N. Patt, "Feedback-driven threading: power-efficient and high-performance execution of multi-threaded workloads on CMPs", International Conference on Architectural Support for Programming Lang. and Oper., pp. 277-286, 2008.
"NAS Parallel Benchmarks", http://www.nas.nasa.gov/publications/npb.html, June 2016.
"Intel Performance Counter Monitor - A better way to measure CPU utilization", http://www.intel.com/software/pcm, June 2016.
"Linux Perf tool", https://perf.wiki.kernel.org/, June 2016.
"CACTI 6.5", http://www.hpl.hp.com/research/cacti, June 2016.
J. L. Gustafson, "Fixed Time, Tiered Memory, and Superlinear Speedup", Distributed Memory Computing Conference, pp. 1255-1260, 1990.
X.-H. Sun and L. M. Ni, "Scalable problems and memory-bounded speedup", Journal of Parallel and Distributed Computing, vol. 19, pp. 27-37, 1993.
S. Song, C. Y. Su, R. Ge, A. Vishnu and K. W. Cameron, "Iso-Energy-Efficiency: An Approach to Power-Constrained Parallel Computation", IEEE International Parallel & Distributed Processing Symposium, pp. 128-139, 2011.
J. Balladini, R. Suppi, D. Rexachs and E. Luque, "Impact of parallel programming models and CPUs clock frequency on energy consumption of HPC systems", International Conference on Computer Systems and Applications, pp. 16-21, 2011.
A. K. Porterfield, S. L. Olivier, S. Bhalachandra and J. F. Prins, "Power Measurement and Concurrency Throttling for Energy Reduction in OpenMP Programs", International Parallel and Distributed Processing Symposium Workshops, pp. 884-891, 2013.
H. Jacobson, P. Bose, G. Wei, and D. Brooks, "Quantifying Sources of Error in McPAT and Potential Impacts on Architectural Studies", 21st International Symposium on High Performance Computer Architecture (HPCA), IEEE, 2015.
"McPAT", http://www.hpl.hp.com/research/mcpat, June 2016.
Published
05/10/2016
How to Cite
SOLVEIRA, Dieison; MORO, Gabriel; DE CRUZ, Eduardo; NAVAUX, Philipe; SCHNORR, Lucas; BAMPI, Sergio.
Energy Consumption Estimation in Parallel Applications: an Analysis in Real and Theoretical Models. In: SIMPÓSIO EM SISTEMAS COMPUTACIONAIS DE ALTO DESEMPENHO (SSCAD), 17., 2016, Aracajú. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2016. p. 73-84.
DOI: https://doi.org/10.5753/wscad.2016.14249.