Workload Balancing Methodology for Data-Intensive Applications with Divisible Load

  • Claudia Rosas, Universitat Autònoma Barcelona
  • Anna Morajko, Universitat Autònoma Barcelona
  • Josep Jorba, Universitat Oberta de Catalunya
  • Eduardo Cesar, Universitat Autònoma Barcelona

Abstract


Data-intensive applications are those that explore, query, analyze, and, in general, process very large data sets. In High Performance Computing (HPC), the main performance problem associated with these applications is usually load imbalance, that is, inefficient resource utilization. This paper proposes a methodology for improving the performance of data-intensive applications. It is based on partitioning the data multiple times prior to execution and ordering the data chunks according to their processing times during execution. As a first step, we assume that a single execution includes multiple related explorations of the same data set. Consequently, we propose to monitor the processing of each exploration and use the gathered data to dynamically tune the performance of the application. The tuning parameters included in the methodology are the partition factor of the data set, the distribution of the resulting data chunks, and the number of processing nodes used by the application. The methodology has been initially tested using the well-known bioinformatics tool BLAST, obtaining encouraging results (an improvement of up to 40%).
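
The partitioning and ordering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `partition` and `schedule_longest_first` are hypothetical, the chunk count is assumed to be the number of workers multiplied by the partition factor, and a greedy longest-processing-time-first heuristic stands in for whatever ordering and distribution policy the paper actually uses. Chunk times are assumed to come from monitoring an earlier exploration of the same data set.

```python
import heapq


def partition(items, num_workers, partition_factor):
    """Split items into roughly num_workers * partition_factor contiguous chunks.

    Assumption: the partition factor is the number of chunks per worker.
    """
    num_chunks = max(1, min(len(items), num_workers * partition_factor))
    base, extra = divmod(len(items), num_chunks)
    chunks, start = [], 0
    for i in range(num_chunks):
        # The first `extra` chunks receive one additional item.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def schedule_longest_first(chunk_times, num_workers):
    """Assign chunk indices to workers, handling the slowest chunks first.

    chunk_times[i] is the processing time measured for chunk i in a
    previous exploration (hypothetical monitoring data).
    Returns (assignment, loads): chunk indices per worker and the total
    estimated time per worker.
    """
    order = sorted(range(len(chunk_times)),
                   key=lambda i: chunk_times[i], reverse=True)
    # Min-heap of (accumulated load, worker id): the least-loaded worker
    # always receives the next chunk.
    heap = [(0.0, w) for w in range(num_workers)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_workers)]
    for idx in order:
        load, w = heapq.heappop(heap)
        assignment[w].append(idx)
        heapq.heappush(heap, (load + chunk_times[idx], w))
    loads = [sum(chunk_times[i] for i in a) for a in assignment]
    return assignment, loads


if __name__ == "__main__":
    chunks = partition(list(range(10)), num_workers=2, partition_factor=2)
    measured = [5, 1, 4, 2]  # made-up per-chunk times, one per chunk
    assignment, loads = schedule_longest_first(measured, num_workers=2)
    print(chunks, assignment, loads)
```

In this sketch, a larger partition factor yields smaller chunks, which gives the scheduler more freedom to even out worker loads at the cost of more per-chunk overhead. That trade-off is one plausible reason to treat the partition factor as a tuning parameter.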
Keywords: Tuning, Load management, Databases, Monitoring, Bioinformatics, Phase measurement, load balancing, data-intensive, DLT
Published
26/10/2011
ROSAS, Claudia; MORAJKO, Anna; JORBA, Josep; CESAR, Eduardo. Workload Balancing Methodology for Data-Intensive Applications with Divisible Load. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 23., 2011, Vitória/ES. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2011. p. 48-55.