On the Elasticity of Parallel Components in a Cloud of High Performance Computing Services

  • Francisco Carvalho Junior, Universidade Federal do Ceará
  • João Marcelo Alencar, Universidade Federal do Ceará

Abstract

Cloud computing offers a virtually unlimited set of resources and the flexibility to allocate them through elasticity. However, cloud limitations, such as configuration complexity and the dynamic nature of the environment, may jeopardize the assurance of QoS requirements. HPC Shelf is a cloud of HPC services that employs a component-oriented architecture to describe the hardware and software resources of parallel computing systems. We design a framework for HPC Shelf that employs cloud elasticity concepts to keep the values of QoS metrics of parallel computing systems within an acceptable range, enabling adaptations that fulfill the restrictions of the QoS contract. In our evaluation, using a linear algebra application, we show how HPC Shelf takes advantage of cloud elasticity to reinforce QoS requirements, rectifying assumptions from ill-defined QoS models.

Published
2019-11-08
How to Cite
JUNIOR, Francisco Carvalho; ALENCAR, João Marcelo. On the Elasticity of Parallel Components in a Cloud of High Performance Computing Services. Proceedings of the Symposium on High Performance Computing Systems (SSCAD), [S.l.], p. 181-192, nov. 2019. ISSN 0000-0000. Available at: <https://sol.sbc.org.br/index.php/sscad/article/view/8667>. Date accessed: 18 may 2024. doi: https://doi.org/10.5753/wscad.2019.8667.