Granularity Control with Threads in Dynamic MPI Programs
Abstract
Granularity control is an important factor in the performance of parallel programs. Static problems adapt their granularity through the decomposition and assignment of data to each task, but in irregular problems the workload cannot be predicted before execution. Irregular problems that use recursive decomposition, such as sorting, need dynamism with support for on-demand task creation. Some programming environments, such as Cilk and KAAPI, offer dynamism and handle granularity through the abstract concept of a task, but they have limitations that hinder their use in high-performance computing (HPC). MPI, the de facto standard in HPC, offers dynamic process creation and the use of threads, but leaves it to the implementation to specify the behavior when a task is created. This work investigates the advantages of granularity control with threads in dynamic MPI programs by replacing process creation with task creation, where a mechanism (libSpawn) decides whether to launch process(es) or thread(s). Results obtained with the Cilksort sorting program, which follows the Divide-and-Conquer model, show gains of up to 85% in task creation and communication.
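The decision step described above can be illustrated with a small sketch. It is not the libSpawn API, which this abstract does not specify: `SpawnPolicy`, `task_sort`, the `max_threads` budget, and the `grain` cutoff are all hypothetical names and parameters chosen for illustration. The sketch assumes a simple rule (use a thread while a local thread slot is free, otherwise fall back to a process) and, to stay self-contained, only records the "process" decision and runs that subtask inline, where an MPI program would launch a process, for example with `MPI_Comm_spawn`.

```python
import threading


class SpawnPolicy:
    """Hypothetical thread-or-process policy (an assumption, not libSpawn)."""

    def __init__(self, max_threads):
        self.max_threads = max_threads  # assumed per-process thread budget
        self.active = 0                 # threads currently running
        self.decisions = {"thread": 0, "process": 0}
        self._lock = threading.Lock()

    def acquire(self):
        # Choose a thread while a slot is free, otherwise a process.
        with self._lock:
            kind = "thread" if self.active < self.max_threads else "process"
            if kind == "thread":
                self.active += 1
            self.decisions[kind] += 1
            return kind

    def release(self):
        with self._lock:
            self.active -= 1


def merge(a, b):
    """Merge two sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def task_sort(data, policy, grain=4):
    """Divide-and-conquer sort; one task is created per split."""
    if len(data) <= grain:
        return sorted(data)  # below the cutoff: no new task
    mid = len(data) // 2
    left, right = data[:mid], data[mid:]
    result = {}
    if policy.acquire() == "thread":
        def run():
            try:
                result["left"] = task_sort(left, policy, grain)
            finally:
                policy.release()
        worker = threading.Thread(target=run)
        worker.start()
        right_sorted = task_sort(right, policy, grain)
        worker.join()
    else:
        # Stand-in only: a real MPI program would spawn a process here.
        result["left"] = task_sort(left, policy, grain)
        right_sorted = task_sort(right, policy, grain)
    return merge(result["left"], right_sorted)
```

With `max_threads=0` every split is recorded as a process decision, which mimics a purely process-based dynamic MPI program. Raising the budget shifts some of those decisions to threads, which is the kind of substitution the abstract evaluates.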
Published
29/10/2008
How to Cite
LIMA, João Vicente F.; MAILLARD, Nicolas. Controle de Granularidade com threads em Programas MPI Dinâmicos. In: SIMPÓSIO EM SISTEMAS COMPUTACIONAIS DE ALTO DESEMPENHO (SSCAD), 9., 2008, Campo Grande. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2008. p. 117-124. DOI: https://doi.org/10.5753/wscad.2008.17675.