Optimizing the Execution of Parallel Applications in a Heterogeneous Cloud Environment

Everton C. de Lima; Marcelo C. Luizelli; Fábio Rossi; Antonio Carlos S. Beck; Arthur F. Lorenzon

doi:10.5753/wscad.2022.226378

Everton C. de Lima, UNIPAMPA
Marcelo C. Luizelli, UNIPAMPA
Fábio Rossi, IFFar
Antonio Carlos S. Beck, UFRGS
Arthur F. Lorenzon, UNIPAMPA

Abstract

Cloud computing is emerging as an alternative platform for running high-performance applications. At the same time, upgrading the compute nodes in these systems can lead to heterogeneous resources. The challenge of running parallel applications in the cloud therefore involves not only choosing the best number of threads for the application, but also choosing the most suitable architecture to run it. However, an application's degree of parallelism and a node's computational capacity have rarely been taken into account when allocating applications in a heterogeneous cloud. In this paper, we show that considering the degree of parallelism of an application together with the characteristics of the compute node yields significant gains in performance and energy consumption compared with the default way applications are scheduled in a heterogeneous cloud environment.
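
The selection problem described above can be illustrated with a minimal sketch. It is not the authors' method: it only shows what it means to choose a node type and a thread count jointly instead of defaulting to the largest configuration. The node labels, the `Run` record, the `best_run` helper, the "use the most threads" default, and every number below are hypothetical placeholders, not measurements from the paper.

```python
from typing import Callable, List, NamedTuple


class Run(NamedTuple):
    """One profiled execution of an application (all fields hypothetical)."""
    node: str        # label of the node type the run was placed on
    threads: int     # number of threads used
    time_s: float    # execution time in seconds
    energy_j: float  # energy consumed in joules


def best_run(runs: List[Run], metric: Callable[[Run], float]) -> Run:
    """Return the run with the smallest value of the given metric."""
    return min(runs, key=metric)


# Invented placeholder measurements, NOT results from the paper.
runs = [
    Run("node-a", 8, 40.0, 2400.0),
    Run("node-a", 16, 30.0, 3600.0),
    Run("node-b", 16, 28.0, 2500.0),
    Run("node-b", 32, 35.0, 4200.0),
]

# Assumed baseline: place the application wherever the most threads fit.
default = max(runs, key=lambda r: r.threads)

# Joint choice of node and thread count, once per objective.
fastest = best_run(runs, lambda r: r.time_s)
most_efficient = best_run(runs, lambda r: r.energy_j)

print("default:       ", default)
print("fastest:       ", fastest)
print("most efficient:", most_efficient)
```

With these placeholder values, the fastest and the most energy-efficient configurations differ from each other and from the assumed default, which is the kind of gap the abstract refers to. In practice the measurements would have to come from profiling the application on each node type.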
