Integrating CUDA memory management mechanisms for domain decomposition of an acoustic wave kernel implemented in OpenMP
Abstract
OpenMP is a widely used directive-based programming model for parallelizing code. Although it has been extended to support offloading to devices such as GPUs, multi-GPU programming with data-mapping directives requires redundant data allocation and non-intuitive data synchronization. This paper studies an alternative implementation: a hybrid CUDA-OpenMP kernel that uses native CUDA Unified Virtual Addressing (UVA) memory pointers inside an OpenMP target kernel.
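The idea can be illustrated with a minimal sketch, which is not the paper's implementation. It splits a 1D second-order acoustic wave update into two subdomains, keeps each subdomain in device memory, hands the raw device pointers to an OpenMP target kernel through `is_device_ptr` instead of `map`, and exchanges one halo cell per side with device-to-device copies. As a stated substitution, the sketch uses OpenMP's own device memory routines (`omp_target_alloc`, `omp_target_memcpy`, `omp_target_free`) so that it builds without the CUDA toolkit; in a CUDA-OpenMP hybrid those calls would presumably be `cudaMalloc`, `cudaMemcpyPeer` and `cudaFree` issued after `cudaSetDevice`, under the assumption that the compiler's OpenMP runtime and the CUDA runtime agree on device numbering and contexts, which is implementation-dependent. The names `pick_dev`, `copy_cells`, `wave_step` and `run_decomposed`, the grid layout and the initial pulse are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#ifdef _OPENMP
#include <omp.h>
#else /* Serial stand-ins so the sketch also builds without OpenMP. */
static int omp_get_num_devices(void) { return 0; }
static int omp_get_initial_device(void) { return 0; }
static void *omp_target_alloc(size_t n, int dev) { (void)dev; return malloc(n); }
static void omp_target_free(void *p, int dev) { (void)dev; free(p); }
static int omp_target_memcpy(void *dst, const void *src, size_t len, size_t doff,
                             size_t soff, int ddev, int sdev) {
    (void)ddev; (void)sdev;
    memcpy((char *)dst + doff, (const char *)src + soff, len);
    return 0;
}
#endif

/* Hypothetical policy: subdomain s goes to device s modulo the device count,
   or to the host when no offload device is available. */
static int pick_dev(int s) {
    int nd = omp_get_num_devices();
    return nd > 0 ? s % nd : omp_get_initial_device();
}

/* Copy n cells from src[soff] on device sdev to dst[doff] on device ddev.
   A CUDA build would use cudaMemcpyPeer here (assumption, see lead-in). */
static void copy_cells(double *dst, size_t doff, int ddev,
                       double *src, size_t soff, int sdev, size_t n) {
    omp_target_memcpy(dst, src, n * sizeof(double), doff * sizeof(double),
                      soff * sizeof(double), ddev, sdev);
}

/* One leapfrog step on nloc interior cells (indices 1..nloc); cells 0 and
   nloc+1 are ghost cells. The arrays are device pointers, so no map clause. */
static void wave_step(double *next, const double *cur, const double *prev,
                      int nloc, double c2, int dev) {
    (void)dev; /* unused when the pragma is ignored */
    #pragma omp target teams distribute parallel for device(dev) \
        is_device_ptr(next, cur, prev)
    for (int i = 1; i <= nloc; ++i)
        next[i] = 2.0 * cur[i] - prev[i]
                + c2 * (cur[i - 1] - 2.0 * cur[i] + cur[i + 1]);
}

/* Runs `steps` steps on two subdomains and on an undecomposed host reference;
   returns the largest absolute difference between the two results. */
double run_decomposed(int n, int steps, double c2) {
    assert(n >= 4 && n % 2 == 0);
    int nloc = n / 2, host = omp_get_initial_device(), dev[2];
    size_t len = (size_t)nloc + 2;
    /* Reference grid: cells 1..n are interior, cells 0 and n+1 stay at zero. */
    double *r0 = calloc((size_t)n + 2, sizeof *r0);
    double *r1 = calloc((size_t)n + 2, sizeof *r1);
    double *r2 = calloc((size_t)n + 2, sizeof *r2);
    double *prev[2], *cur[2], *next[2], *tmp, err = 0.0;
    r0[nloc] = r1[nloc] = 1.0; /* initial pulse with zero initial velocity */
    for (int s = 0; s < 2; ++s) { /* subdomain s holds global cells s*nloc .. s*nloc+nloc+1 */
        dev[s] = pick_dev(s);
        prev[s] = omp_target_alloc(len * sizeof(double), dev[s]);
        cur[s] = omp_target_alloc(len * sizeof(double), dev[s]);
        next[s] = omp_target_alloc(len * sizeof(double), dev[s]);
        copy_cells(prev[s], 0, dev[s], r0, (size_t)s * nloc, host, len);
        copy_cells(cur[s], 0, dev[s], r1, (size_t)s * nloc, host, len);
        copy_cells(next[s], 0, dev[s], r2, (size_t)s * nloc, host, len);
    }
    for (int t = 0; t < steps; ++t) {
        for (int i = 1; i <= n; ++i) /* host reference step */
            r2[i] = 2.0 * r1[i] - r0[i]
                  + c2 * (r1[i - 1] - 2.0 * r1[i] + r1[i + 1]);
        tmp = r0; r0 = r1; r1 = r2; r2 = tmp;
        for (int s = 0; s < 2; ++s)
            wave_step(next[s], cur[s], prev[s], nloc, c2, dev[s]);
        /* Halo exchange: one cell each way, copied device to device. */
        copy_cells(next[0], (size_t)nloc + 1, dev[0], next[1], 1, dev[1], 1);
        copy_cells(next[1], 0, dev[1], next[0], (size_t)nloc, dev[0], 1);
        for (int s = 0; s < 2; ++s) {
            tmp = prev[s]; prev[s] = cur[s]; cur[s] = next[s]; next[s] = tmp;
        }
    }
    for (int s = 0; s < 2; ++s) { /* copy back (r2 is scratch now) and compare */
        copy_cells(r2, 0, host, cur[s], 0, dev[s], len);
        for (int i = 1; i <= nloc; ++i) {
            double d = r2[i] - r1[s * nloc + i];
            if (d < 0.0) d = -d;
            if (d > err) err = d;
        }
        omp_target_free(prev[s], dev[s]);
        omp_target_free(cur[s], dev[s]);
        omp_target_free(next[s], dev[s]);
    }
    free(r0); free(r1); free(r2);
    return err;
}
```

A call such as `run_decomposed(64, 100, 0.25)` should return a value at or near zero, since both versions apply the same stencil. Each subdomain is allocated once and only the two interface cells move between devices per step, which is the contrast with `map`-based offloading that the abstract draws. The two target regions in this sketch run one after the other; overlapping them (for example with `nowait` or one host thread per device) is left out for brevity.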