GPU Performance Predictor Applied to Dark Silicon-Aware Design Space Exploration
Abstract
Simulators of heterogeneous GP-GPU systems aim to provide accurate performance estimates at the cost of long execution times. To avoid the costly simulation process during architectural exploration of GPU-based systems, we developed and evaluated several GPU performance predictors based on machine learning algorithms that combine accuracy with low computational cost. The quality of the predictors developed in this work was assessed using metrics such as the coefficient of determination, training score, and cross-validation. Predictors based on Random Forest and SVR achieved the best results in both accuracy and performance.
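The evaluation metrics named in the abstract (coefficient of determination, training score, and cross-validation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the feature ("number of GPU cores") and target ("execution time") are hypothetical, and a one-feature least-squares line stands in for the Random Forest and SVR models, which are not reproduced here.

```python
import random
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns a predict function.

    Stand-in model only (assumption): the paper's predictors are not linear.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def cross_val_r2(xs, ys, k=5):
    """R^2 on each of k held-out folds (contiguous folds; data assumed shuffled)."""
    n = len(xs)
    scores = []
    for i in range(k):
        lo, hi = i * n // k, (i + 1) * n // k
        # Train on everything outside the fold, score on the fold itself.
        model = fit_line(xs[:lo] + xs[hi:], ys[:lo] + ys[hi:])
        scores.append(r2_score(ys[lo:hi], [model(x) for x in xs[lo:hi]]))
    return scores


# Synthetic, hypothetical data: x = number of GPU cores, y = execution time.
rng = random.Random(0)
xs = [rng.uniform(1, 64) for _ in range(100)]
ys = [200.0 - 2.5 * x + rng.gauss(0, 5) for x in xs]

model = fit_line(xs, ys)
train_score = r2_score(ys, [model(x) for x in xs])  # score on the training data
cv_scores = cross_val_r2(xs, ys, k=5)               # scores on held-out folds

print(f"training R^2: {train_score:.3f}")
print(f"mean cross-validated R^2: {mean(cv_scores):.3f}")
```

A training score well above the cross-validated score would suggest overfitting, which is one reason to report both rather than the training score alone.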
Published
21/10/2020
How to Cite
DUENHA, Liana; SONOHATA, Rhayssa; ARIGONI, Danillo; SANTOS, Ricardo. Preditor de Desempenho de GPUs aplicado à Exploração do Espaço de Projetos ciente de Dark Silicon. In: SIMPÓSIO EM SISTEMAS COMPUTACIONAIS DE ALTO DESEMPENHO (SSCAD), 21., 2020, Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 299-310. DOI: https://doi.org/10.5753/wscad.2020.14078.