CUDA implementation of NPB Kernels

  • Gabriell Alves de Araújo PUCRS
  • Dalvan Griebler PUCRS/SETREM
  • Luiz Gustavo Leão Fernandes GMAP - PPGCC - PUCRS

Abstract


The NAS Parallel Benchmarks (NPB) are a set of benchmarks used to evaluate hardware and software, and over the years they have been ported to different frameworks. For GPUs, only OpenCL and OpenACC versions currently exist. This paper contributes to the literature by providing the first complete CUDA implementation of the NPB kernels, experimenting with unprecedented workloads, and revealing new facts about the NPB.

Keywords: Parallel and Distributed Algorithms; Specific and Dedicated Architectures (GPUs, FPGAs, and others); Performance Evaluation, Measurement, and Prediction; Languages, Compilers, and Tools for Parallel and Distributed Computing; Parallelism Extraction Techniques and Methods

References

Bailey, D. H., Barszcz, E., Barton, J. T., Browning, D. S., Carter, R. L., Fatoohi, R. A., Frederickson, P. O., Lasinski, T. A., Simon, H. D., Venkatakrishnan, V., and Weeratunga, S. K. (1994). The NAS Parallel Benchmarks RNR-94-007. Technical report, NASA Advanced Supercomputing Division.

Griebler, D., Loff, J., Mencagli, G., Danelutto, M., and Fernandes, L. G. (2018). Efficient NAS Benchmark Kernels with C++ Parallel Programming. In 26th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), PDP’18, pages 733–740, Cambridge, UK. IEEE.

Seo, S., Jo, G., and Lee, J. (2011). Performance Characterization of the NAS Parallel Benchmarks in OpenCL. In 2011 IEEE International Symposium on Workload Characterization (IISWC), pages 137–148.

Tian, X., Xu, R., Yan, Y., Chandrasekaran, S., Eachempati, D., and Chapman, B. (2016). Compiler Transformation of Nested Loops for General Purpose GPUs. Concurrency and Computation: Practice and Experience, 28(2):537–556.

Xu, R., Tian, X., Chandrasekaran, S., Yan, Y., and Chapman, B. (2015). NAS Parallel Benchmarks for GPGPUs Using a Directive-Based Programming Model. In Brodman, J. and Tu, P., editors, Languages and Compilers for Parallel Computing, pages 67–81, Cham. Springer International Publishing.
Published
2020-04-15
DE ARAÚJO, Gabriell Alves; GRIEBLER, Dalvan; FERNANDES, Luiz Gustavo Leão. CUDA implementation of NPB Kernels. In: REGIONAL SCHOOL OF HIGH PERFORMANCE COMPUTING FROM SOUTHERN BRAZIL (ERAD-RS), 20., 2020, Santa Maria. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 85-88. ISSN 2595-4164. DOI: https://doi.org/10.5753/eradrs.2020.10762.