Monitoring HPC applications on the cloud with Zabbix

Abstract


When users overprovision cloud resources for their HPC applications, the resources are underutilized and wasted. One solution is to monitor resource usage. In this work, we propose a viable architecture for monitoring HPC applications in cloud environments using the Zabbix tool. With this architecture, we will be able to minimize cost and develop a methodology for monitoring different kinds of HPC applications.

Keywords: cloud computing, resource monitoring, high performance computing, computing resources, cost optimization

Published
2020-08-19
TAVARES, William F. C.; ASSIS, Marcio Roberto Miranda; BORIN, Edson. Monitoring HPC applications on the cloud with Zabbix. In: REGIONAL SCHOOL OF HIGH PERFORMANCE COMPUTING FROM SÃO PAULO (ERAD-SP), 11., 2020, Online Event. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 70-73. DOI: https://doi.org/10.5753/eradsp.2020.16889.
