Melhorando o Desempenho de Algoritmos do Tipo Branch & Bound em MPI via Escalonador com Roubo Aleatório de Tarefas

Stéfano D. K. Mór; Nicolas Maillard

doi:10.5753/wscad.2009.17387

Stéfano D. K. Mór UFRGS
Nicolas Maillard UFRGS

DOI: https://doi.org/10.5753/wscad.2009.17387

Resumo

Nossa principal contribuição é a integração de um modelo de escalonamento distribuído por roubo de tarefas para computação em MPI capaz de otimizar o desempenho de programas do tipo Branch & Bound. Esse escalonador é introduzido em tempo de compilação e é independente da distribuição MPI usada. Resultados experimentais mostram que se pode obter um ganho de até 80% no desempenho, mantendo o speedup próximo ao linear e sem a perda do consumo linear de memória. Esses ganhos se confirmam mesmo em um ambiente de processadores homogêneos, que tendem a produzir um menor desbalanceamento da carga de trabalho.

Referências

R. D. Blumofe and C. E. Leiserson. Scheduling multithreaded computations by work stealing. In Proceedings of the 35th Annual Symposium on Foundations of Computer Science. MIT Laboratory for Computer Science, 1994.

R. D. Blumofe and C. E. Leiserson. Scheduling multithreaded computations by work stealing. Journal of ACM, 46(5):720–748, September 1999.

R. P. Brent. The parallel evaluation of general arithmetic expressions. J. ACM, 21(2):201–206, 1974.

P. Brucker. Scheduling Algorithms. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2001.

F. W. Burton and M. R. Sleep. Executing functional programs on a virtual tree of processors. In FPCA ’81: Proceedings of the 1981 conference on Functional programming languages and computer architecture, pages 187–194, New York, NY, USA, 1981. ACM.

T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms, Second Edition. The MIT Press, September 2001.

R. Feldmann, P. Mysliwietz, and B. Monien. A fully distributed chess program. Advances in Computer Chess VI, pages 1–27, 1991.

W. Gropp, E. Lusk, and A. Skjellum. Using MPI - Portable Parallel Programming with Message-Passing Interface. Scientific and Engineering Computation Series. The MIT Press, Massachusetts Institue of Technology - Cambridge, Massachusetts 02142, 2nd edition, 1999.

R. H. Halstead, Jr. Implementation of multilisp: Lisp on a multiprocessor. In LFP ’84: Proceedings of the 1984 ACM Symposium on LISP and functional programming, pages 9–17, New York, NY, USA, 1984. ACM.

C. F. Joerg, M. Halbherr, and Y. Zhou. MIMD-style parallel programming based on continuation-passing threads. In Massachusetts Institute of Technology, Laboratory for Computer Science, page pp., 1994.

H. Keller, U. Pferschy, and D. Pisinger. Knapsack Problems. Springer, 2005.

C. E. Leiserson and B. C. Kuszmaul. Synchronized MIMD computing. Technical report, Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 1994.

J. V. F. Lima and N. Maillard. Controle de granularidade com threads em programas mpi dinâmicos. In SBC, editor, IX Simpósio em Sistemas Computacionais - WSCAD-SSC 2008, pages 117–124, Campo Grande, Brasil, Nov 2008.

P. Liu, W. Aiello, and S. Bhatt. An atomic model for message-passing. In SPAA ’93: Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures, pages 154–163, New York, NY, USA, 1993. ACM.

E. Mohr, D. A. Kranz, and R. H. Halstead, Jr. Lazy task creation: a technique for increasing the granularity of parallel programs. In LFP ’90: Proceedings of the 1990 ACM conference on LISP and functional programming, pages 185–197, New York, NY, USA, 1990. ACM.

P. O. A. Navaux and C. A. F. D. Rose. Arquiteturas Paralelas. Number 15 in Série Livros Didáticos. Editora Sagra Luzzato, 1st edition, 2003.

P. S. Pacheco. Paralell Programming With MPI. Morgan Kaufmann Publishers, 1st edition, 1997.

K. H. Randall. Cilk: Efficient Multithreaded Computing. PhD thesis, MIT, Jun 1998.

A. S. Tanenbaum. Modern Operating Systems. Prentice Hall Press, Upper Saddle River, NJ, USA, 2007.

D. Traoré, J.-L. Roch, N. Maillard, T. Gautier, and J. Bernard. Deque-free work-optimal parallel STL algorithms. In Euro-Par ’08: Proceedings of the 14th international EuroPar conference on Parallel Processing, pages 887–897, Berlin, Heidelberg, 2008. Springer-Verlag.

I.-C. Wu. Efficient parallel divide-and-conquer for a class of interconnection topologies. In ISA ’91: Proceedings of the 2nd International Symposium on Algorithms, pages 229–240, London, UK, 1991. Springer-Verlag.

I.-C. Wu and H. T. Kung. Communication complexity for parallel divide-and-conquer. In In Proceedings of the 32nd Annual Symposium on Foundations of Computer Science, pages 151–162, 1991.