Energy-Aware Scheduling for Serverless Scientific Workflows: A Machine Learning Approach

Lucas Rosa (USP); Alfredo Goldman (USP)

DOI: https://doi.org/10.5753/eradsp.2024.239934

Abstract

This paper proposes to address the challenges of energy efficiency and scheduling for scientific workflows in serverless computing environments. By combining machine learning techniques with simulation, the research aims to bridge the gap between energy efficiency and serverless scheduling. The methodology comprises collecting historical data, predicting energy consumption with machine learning, and developing scheduling policies based on deep neural networks. The project also involves adapting workflow management systems and validating the approach in real environments, with the goal of offering viable solutions to current challenges in HPC.
