Proposal for Parameterization Support in NPB with CUDA
This work proposes introducing configurable GPU parameters into the NAS Parallel Benchmarks (NPB). The initial stage of the study covered parameterizing the number of threads per block and evaluating its impact on GPU performance.
References
Bailey, D. H., Barszcz, E., Barton, J. T., Browning, D. S., Carter, R. L., Fatoohi, R. A., Frederickson, P. O., Lasinski, T. A., Simon, H. D., Venkatakrishnan, V., and Weeratunga, S. K. (1994). The NAS Parallel Benchmarks RNR-94-007. Technical report, NASA Advanced Supercomputing Division.
d. Araujo, G. A., Griebler, D., Danelutto, M., and Fernandes, L. G. (2020). Efficient NAS Parallel Benchmark Kernels with CUDA. In 2020 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), pages 9–16.
Do, Y., Kim, H., Oh, P., Park, D., and Lee, J. (2019). SNU-NPB 2019: Parallelizing and Optimizing NPB in OpenCL and CUDA for Modern GPUs. In 2019 IEEE International Symposium on Workload Characterization (IISWC), pages 93–105.
Griebler, D., Loff, J., Mencagli, G., Danelutto, M., and Fernandes, L. G. (2018). Efficient NAS Benchmark Kernels with C++ Parallel Programming. In 26th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), PDP’18, pages 733–740, Cambridge, UK. IEEE.
Seo, S., Jo, G., and Lee, J. (2011). Performance Characterization of the NAS Parallel Benchmarks in OpenCL. In 2011 IEEE International Symposium on Workload Characterization (IISWC), pages 137–148.
Xu, R., Tian, X., Chandrasekaran, S., Yan, Y., and Chapman, B. (2015). NAS Parallel Benchmarks for GPGPUs Using a Directive-Based Programming Model. In Brodman, J. and Tu, P., editors, Languages and Compilers for Parallel Computing, pages 67–81, Cham. Springer International Publishing.
How to Cite
ARAUJO, Gabriell; GRIEBLER, Dalvan; FERNANDES, Luiz G. Proposta de Suporte à Parametrização no NPB com CUDA. In: ESCOLA REGIONAL DE ALTO DESEMPENHO DA REGIÃO SUL (ERAD-RS), 21., 2021, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, p. 103-104. ISSN 2595-4164.