Convergence of HPC and Big Data in extreme-scale data analysis through the DCEx programming model

  • Javier Garcia-Blas, University Carlos III of Madrid
  • Javier Fernandez Muñoz, University Carlos III of Madrid
  • Jesus Carretero, University Carlos III of Madrid
  • Fabrizio Marozzo, University of Calabria
  • Domenico Talia, University of Calabria
  • Paolo Trunfio, University of Calabria
  • Alberto Fernandez-Pena, University Carlos III of Madrid
  • Daniel Martín de Blas, University Carlos III of Madrid

Abstract


High-level programming models let application developers access and use resources without managing low-level architectural entities: a parallel programming model defines a set of programming abstractions that simplify how a programmer structures and expresses an algorithm. Early proposals for Exascale programming tools adapt traditional parallel programming languages and hybrid solutions. This incremental approach is too conservative and often results in very complex code. This paper describes the design features, programming constructs, and runtime mechanisms of the Data Centric programming model for Exascale systems (DCEx). DCEx structures applications into data-parallel blocks. Blocks are units of shared- and distributed-memory parallel computation, communication, and migration in the memory/storage hierarchy. The DCEx runtime maps blocks and their message queues onto processes and places them in memory/storage. These data-parallel blocks are orchestrated through distributed parallel patterns that reduce development cost. DCEx aims at the convergence of traditional HPC programming models, mainly based on MPI, with emerging technologies based on data-intensive paradigms. To demonstrate the potential of DCEx, we carried out an experimental evaluation by developing a real-world application that processes diffusion-weighted magnetic resonance imaging data in a neuroimaging research context.
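To make the idea of data-parallel blocks orchestrated by a parallel pattern more concrete, the following is a minimal single-node sketch of a block-wise map/reduce pattern. It is not the DCEx API: the names `split_into_blocks` and `map_reduce_blocks` are hypothetical, a thread pool stands in for the distributed runtime, and message queues and placement in the memory/storage hierarchy are not modelled.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def split_into_blocks(data: Sequence[T], n_blocks: int) -> List[Sequence[T]]:
    """Partition data into at most n_blocks contiguous, near-equal blocks."""
    if n_blocks < 1:
        raise ValueError("n_blocks must be at least 1")
    size, extra = divmod(len(data), n_blocks)
    blocks: List[Sequence[T]] = []
    start = 0
    for i in range(n_blocks):
        # The first `extra` blocks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty blocks when data is shorter than n_blocks
            blocks.append(data[start:end])
        start = end
    return blocks


def map_reduce_blocks(
    data: Sequence[T],
    n_blocks: int,
    block_fn: Callable[[Sequence[T]], R],
    combine: Callable[[R, R], R],
    initial: R,
) -> R:
    """Apply block_fn to every block concurrently, then fold the partial results.

    Hypothetical stand-in for a distributed pattern: a real runtime would map
    blocks onto processes; here each block is simply handed to a worker thread.
    """
    blocks = split_into_blocks(data, n_blocks)
    with ThreadPoolExecutor(max_workers=max(1, len(blocks))) as pool:
        partials = list(pool.map(block_fn, blocks))
    return reduce(combine, partials, initial)


# Usage: sum of squares computed block by block.
total = map_reduce_blocks(
    list(range(10)),
    n_blocks=3,
    block_fn=lambda block: sum(x * x for x in block),
    combine=lambda a, b: a + b,
    initial=0,
)
```

The application code only supplies the per-block function and the combiner; partitioning and scheduling stay inside the pattern, which is the kind of separation the abstract attributes to pattern-based orchestration.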
Keywords: Big Data, HPC convergence, Programming model, PGAS, parallel patterns
Published
02/11/2022
GARCIA-BLAS, Javier; MUÑOZ, Javier Fernandez; CARRETERO, Jesus; MAROZZO, Fabrizio; TALIA, Domenico; TRUNFIO, Paolo; FERNANDEZ-PENA, Alberto; BLAS, Daniel Martín de. Convergence of HPC and Big Data in extreme-scale data analysis through the DCEx programming model. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 34., 2022, Bordeaux/France. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 130-139.