Comparison and Performance Analysis of Graphics Accelerators in Matrix Processing

  • Nielsen Gonçalves UFPA
  • Carlos Costa UFPA
  • Josivaldo Araújo UFPA
  • Jessé Costa UFPA
  • Jairo Panetta UFPA

Abstract


In recent years, traditional High Performance Computing (HPC) solutions, such as adding or replacing processors, have undergone major changes with the addition of new features. The use of graphics accelerators is one of the features that has made it possible to keep expanding computational performance. However, as with other techniques, it also requires specific programming skills to extract the computational power offered by the combined CPU and GPU set. This paper compares OpenACC, CUDA, and OpenMP in a performance evaluation for matrix processing.

Published
2015-07-20
GONÇALVES, Nielsen; COSTA, Carlos; ARAÚJO, Josivaldo; COSTA, Jessé; PANETTA, Jairo. Comparação e Análise de Desempenho de Aceleradores Gráficos no Processamento de Matrizes. In: WORKSHOP ON PERFORMANCE OF COMPUTER AND COMMUNICATION SYSTEMS (WPERFORMANCE), 14., 2015, Recife. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2015. p. 43-55. ISSN 2595-6167. DOI: https://doi.org/10.5753/wperformance.2015.10396.