Exploring Simplicity and Efficiency: Regression-based Scheduling Heuristics in HPC

  • Lucas Rosa (USP)
  • Danilo Carastan-Santos (CNRS / Inria / Grenoble INP / Univ. Grenoble Alpes)
  • Alfredo Goldman (USP)

Abstract


This research examines the interplay between resource management in high-performance computing (HPC) systems and the use of machine learning techniques to develop scheduling heuristics. It explores whether scheduling heuristics obtained by linear regression over polynomial functions of job characteristics can improve performance. Larger polynomials caused instability due to multicollinearity effects, whereas the simplest polynomial delivered stable and efficient scheduling performance. The study also evaluates the long-term resilience of these regression-based heuristics.
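The multicollinearity effect mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation or data: the job characteristics `p`, `q`, `r`, the synthetic target `score`, and the helpers `poly_features` and `fit` are all hypothetical names chosen for illustration. The sketch fits an ordinary least-squares regression on polynomial features of three job characteristics and compares the condition number of the design matrix for a degree-1 and a degree-3 polynomial, since a larger condition number is one common symptom of multicollinearity.

```python
import numpy as np

def poly_features(p, q, r, degree):
    """Return all monomials p^a * q^b * r^c with 1 <= a + b + c <= degree."""
    cols = []
    for a in range(degree + 1):
        for b in range(degree + 1 - a):
            for c in range(degree + 1 - a - b):
                if a + b + c >= 1:
                    cols.append((p ** a) * (q ** b) * (r ** c))
    return np.column_stack(cols)

def fit(X, y):
    """Ordinary least squares with an intercept column prepended."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A, coef

# Synthetic data only (assumption): three made-up job characteristics.
rng = np.random.default_rng(0)
n = 500
p = rng.uniform(1.0, 10.0, n)  # hypothetical: estimated processing time
q = rng.uniform(1.0, 10.0, n)  # hypothetical: requested processors
r = rng.uniform(1.0, 10.0, n)  # hypothetical: submission time
score = 0.5 * p + 0.3 * q + 0.2 * r + rng.normal(0.0, 0.1, n)

# Degree 1 yields the columns r, q, p (in that order); degree 3 yields 19 monomials.
A1, coef1 = fit(poly_features(p, q, r, 1), score)
A3, coef3 = fit(poly_features(p, q, r, 3), score)

# The degree-3 design contains the degree-1 columns plus strongly
# correlated higher-order terms, so its condition number is larger.
cond1 = np.linalg.cond(A1)
cond3 = np.linalg.cond(A3)
print(f"condition number, degree 1: {cond1:.3g}")
print(f"condition number, degree 3: {cond3:.3g}")

# A regression-based heuristic could then order the waiting queue by
# the predicted score (lower score first in this sketch).
queue_order = np.argsort(A1 @ coef1)
```

In this sketch the degree-1 model recovers the three slopes used to generate the synthetic score, while the degree-3 design matrix is far worse conditioned, which makes its coefficients more sensitive to noise in the training data.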

Published
17/07/2023
ROSA, Lucas; CARASTAN-SANTOS, Danilo; GOLDMAN, Alfredo. Exploring Simplicity and Efficiency: Regression-based Scheduling Heuristics in HPC. In: ESCOLA REGIONAL DE ALTO DESEMPENHO DE SÃO PAULO (ERAD-SP), 14., 2023, São José dos Campos/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 41-44. DOI: https://doi.org/10.5753/eradsp.2023.232635.
