Efficient Dynamic Scheduling on Hybrid Architectures
Abstract
Applications that must process large volumes of data within acceptable time have been driving the development of new architectures composed of different processing units (PUs). Runtime environments have been proposed to exploit these resources, offering methods capable of scheduling tasks across the different PUs. Although most applications are heterogeneous (tasks with distinct characteristics), current techniques handle these characteristics in isolation, leading to inefficient executions. In this work we present two new scheduling strategies that combine different approaches, generalize across different scenarios, and are up to 20% more efficient than current techniques.
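The two strategies themselves are not detailed on this page. Purely as an illustrative sketch of the kind of dynamic scheduling over heterogeneous PUs the abstract refers to, the snippet below greedily assigns each task to the processing unit with the earliest predicted finish time. The unit names, speed factors, and the schedule function are hypothetical assumptions for illustration, not the paper's algorithm.

```python
import random

# Hypothetical processing-unit model: each PU has a relative speed factor
# (here the GPU is assumed to run these tasks 4x faster than a CPU core).
# Names and numbers are illustrative assumptions, not values from the paper.
PROCESSING_UNITS = {
    "cpu-0": 1.0,
    "cpu-1": 1.0,
    "gpu-0": 4.0,
}

def schedule(tasks, units=PROCESSING_UNITS):
    """Greedy earliest-finish-time heuristic over heterogeneous units.

    `tasks` is a list of (task_id, cost) pairs, where `cost` is the work a
    baseline unit of speed 1.0 would need. Each task goes to the unit that
    is predicted to finish it first, given the load already assigned to it.
    """
    free_at = {name: 0.0 for name in units}  # when each unit becomes idle
    assignment = {}
    for task_id, cost in tasks:
        # Predicted finish = time the unit frees up + task runtime on that unit.
        unit = min(units, key=lambda u: free_at[u] + cost / units[u])
        finish = free_at[unit] + cost / units[unit]
        free_at[unit] = finish
        assignment[task_id] = (unit, finish)
    return assignment

if __name__ == "__main__":
    random.seed(0)
    demo_tasks = [(f"t{i}", random.uniform(1, 10)) for i in range(8)]
    for task_id, (unit, finish) in schedule(demo_tasks).items():
        print(f"{task_id} -> {unit} (predicted finish: {finish:.2f})")
```

Production runtimes for hybrid architectures, such as StarPU, refine this basic idea with per-task performance models learned from execution history and estimates of data-transfer costs.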
Published
23/10/2013
How to Cite
ANDRADE, Guilherme; FERREIRA, Renato; RAMOS, Gabriel; SACHETTO, Rafael; MADEIRA, Daniel; ROCHA, Leonardo. Escalonamento Dinâmico Eficiente em Arquiteturas Híbridas. In: SIMPÓSIO EM SISTEMAS COMPUTACIONAIS DE ALTO DESEMPENHO (SSCAD), 14., 2013, Porto de Galinhas. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2013. p. 144-151. DOI: https://doi.org/10.5753/wscad.2013.16784.