Portability and Efficiency of the Fletcher Method for Seismic Applications on Multicore and GPU Architectures
Abstract
The simulation of acoustic wave propagation is the basis of the seismic imaging tools used by the oil and gas industry. These simulations run on high-performance computing (HPC) architectures, which deliver faster and more accurate results with each generation of processors. Achieving high performance on these architectures, however, requires addressing several challenges during application development. In this paper, the Fletcher model was optimized for multicore and GPU architectures, and the performance, energy consumption, and energy efficiency of eight versions of the code were evaluated. The results show that the CUDA version has the best performance and energy efficiency; however, the OpenACC version, which has the advantage of portability, shows a degradation of only 10% in performance and 8% in energy efficiency compared with CUDA.