A Survey of Load Balancers in Modern Multi-Threading Systems

  • Prasad Kakulavarapu McGill University
  • José Nelson Amaral University of Delaware

Resumo


Multithreaded architectures are a feasible approach to exploit parallelism in both regular and irregular applications. Today a large collection of multi-threading architectures with different threaded models, and implementation platforms are available. These architectures provide support for multithreading either at hardware level, with customized functional units, or at the software level, as emulators written in some high-level language. The later approach is usually prefered because of its favorable price tag, speed of development, and portability. In this article we review some of these architectures focusing in their capabilities to provide load balancing for irregular, data-parallel and recursive applications. The paper is anchored on a description of our own implementation of load balancers for EARTH - Efficient Architecture for Running Threads. Most of the multithreading architectures that we review are software emulations based on off-the-shelf hardware and compiler technologies.

Palavras-chave: Fine-grain parallelism, multithreading, non-blocking threads, context-switching, runtime system, distributed memory, dynamic load balancing

Referências

Robert Alverson, David Callahan, Daniel Cummings, Brian Koblenz, Allan Porterfield, and Bunon Smith. The Tera Computer System. In Proc., of Intl. Conf on Supercomputing, Amsterdam, The Netherlands. pages 1-6, June 1990.

Eduard Ayguade', Mario Fumari, Maurizio Giordano, Hans-Christian Hoppe, Jesus Labarta, Xavier Martorell, Nacho Navarro, Dimitrios Nikolopoulos. Theodore Papatheodorou, and Eleftherios Polychronopoulos. Nano-Threads: Programming Model Specification. In Deliverable Ml.Dl. ESPRIT Project NANOS (No. 21907). University of Patras, July 1997.

Robert Blumofe and Charles Leiserson. Scheduling Multithreaded Computations by Work Stealing. In Proc. of the 35th Annual Symposium of foundations of Computer Science (FOCS), Santa Fe, New Mexico, pages 356-368, November 1994.

Robert D. Blumofe, Matteo Frigo, Christopher F. Joerg, Charles E. Leiserson, and Keith H. Randall. An analysis of dag-consistent distributed shared-memory algorithms. In Proceedings of the 8th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 297-308, Padua, Italy, June 24-26, 1996. SIGACT/SIGARCH.

Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou. Cilk: An Efficient Multithreaded Runtime System. Journal of Parallel and Distributed Computing, 37(1):55-69, August 1996.

Haiying Cai, Olivier Maquelin, Prasad Kakulavarapu, and Guang R. Gao. Design and Evaluation of Dynamic Load Balancing Schemes under a Fine-grain Multithreaded Execution Model. In Proc. of the Multithreaded Execution Architecture and Compilation Workshop. Orlando, Florida, January 1999.

Gerson G. H. Cavalheiro, Francois Galilee, and Jean-Louis Roch. Athapascan-1: Parallel Programming with Asynchronous Tasks. In Proc. of Workshop on Multithreading, Yale University, 1998.

Guang-len Cheng, Mingdong Feng, Charles E. Leiserson, Keith H. Randall, and Andrew F. Stark. Detecting Data Races in Cilk Programs that Use Locks. In Proc. of 10th Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA '98), Puerto Vallarta, Mexico, pages 298-309. June 1998.

David E. Culler, Anurag Sah, Klaus Erik Schauser, Thorsten von Eicken, and John Wawrzynek. Fine-grain Parallelism with Minimal Hardware Suppon: A Compiler-Controlled Threaded Abstract Machine. In Proc. of the Fourth Intl. Conf. on Architectural Support for Programming Languages and Operating Systems, Santa Clara. CA, April 1991.

David E. Culler, Anumg Sah, Klaus Erik Schauser, Thorsten von Eicken, and John Wawrzynek. Fine-grain parallelism with minimal hardware suppon: A compiler-controlled threaded abstract machine. In Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 164-175, Santa Clara, California, April 8-11, 1991. ACM SIGARCH. SIGPLAN, SIGOPS, and the IEEE Computer Society. Computer Architecture News, 19(2), April 1991; Operating Systems Review, 25, April 1991; SIGPLAN Notices, 26(4), Apri1 1991.

Susan Eggers, Joel Emer, Henry Levy, Jack Lo, Rebecca Stamm, and Dean Tullsen. Simultaneous Multithreading: A Platform for Next-generation Processors. In Proc. of IEEE Micro, pages 12-18. sept 1997.

Vincent W. Freeh, David K. Lowenthal, and Gregory R. Andrews. Distributed Filaments: Efficient Fine-Grain Parallelism on a Cluster of Workstations. In Proc. of the First Symposium on Operating Systems Design and Implementation. Usenix Association. November 1994.

Matteo Frigo. Charles E. Leiserson, and Keith H. Randall. The Implementation of the Cilk-5 Multithreaded Language. In Proc. of ACM SIGPLAN Conference on Programming Language Design and Implementation, June 1998.

Guang R. Gao. An Efficient Hybrid Dataflow Architecture Model. Journal of Parallelism, 19(4). December 1993.

Guang R. Gao, Herben H. J. Hum, and Yue-Bong Wong. Parallel Function Invocation in a Dynamic Argument-Fetching Dataflow Architecture. In Proc. of PARBASE-90: Intl.. Conf. on Databases Parallel Architectures, and their Applications, Miami Beach, Florida, pages 112-116, March 1990.

L. J. Hendren. X. Tang, Y. Zhu, G. R. Gao, X. Xue, H. Cai, and P. Ouellet. Compiling C for the EARTH Multithreaded Architecture. In Proc. of the 1996 Conf on Parallel Architectures and Compilation Techniques (PACT'96), Boston, Mass.Intl. Journal of Parallel Programming, pages 12-23. October 1996.

Herbert H. J. Hum, Olivier Maquelin, Kevin B. Theobald, Xinmin Tian, Guang R. Gao, and Laurie J. Hendren. A study of the EARTH-MANNA multithreaded system. International Journal of Parallel Programming, 24(4):319-347, August 1996.

Herbert H. J. Hum, Olivier Maquelin, Kevin B. Theobald, Xinmin Tian, Xinan Tang, Guang R. Gao, Phil Cupryk. Nasser Elmasri, Lauric J. Hendren, Albeno Jimenez, Shoba Krishnan, Andres Marquez. Sharnir Merali, Shashank S. Nemawarkar, Prakash Panangaden, Xun Xue, and Yingchun Zhu. A design study of the EARTH multiprocessor. In Lubomir Bic, Wim Bohm, Paraskevas Evripidou, and Jean-Luc Gaudiot, editors, Proceedings of the IFIP WG 10.3 Working Conference on Parallel Architecture and Compilation Techniques, PACT '95, pages 59-68, Limassol, Cyprus, June 27-29. 1995. ACM Press.

Herbert Hing-Jing Hum. The Super-Actor Machine: a Hybrid Dataflow/von Neumann Architecture. PhD thesis, McGill University, Montréal, Québec, May 1992.

Prasad Kakulavarapu, Olivier Maquelin, Jose Nelson Amaral, and Guang R. Gao. A Runtime System for Fine-grain Multithreaded Multiprocessor Systems. In Technical Memo 24. CAPSL. University of Delaware, may 1999.

Olivier C. Maquelin. Load Balancing and Resource Management in the ADAM Machine. In Second Workshop on Dataflow Computing, Hamilton lsland, Australia, 1992, Published in Advanced Topics in Dataflow Computing and Multithreading. Lubomir Bic. Guang R. Gao, Jean-Luc Gaudiot editors, IEEE Computer Society, 1995.

Olivier C. Maquelin, Herben H. J. Hum, and Guang R. Gao. Costs and Benefits of Multithreading with Off-the-Shelf RISC Processors. In Proc. of the First International EURO-PA R Confuence, No. 966 in Lecture Notes in Computer Science, Stockholm, Sweden, pages 117-128, August 1995.

Edward Mascarenhas and Vemon Rego. Ariadne: Architecture of a Ponable Threads System supponing Thread Migration. Software - Practice and Experience, 26(3):327-356, March 1996.

Piyush Mehtotra and Matthew Haines. An Overview of the Opus Language and Runtime System. Technical repon. May 1994.

Klaus Erik Schauser, David E. Culler, and Thorsten von Eickcn. Compiler-Controlled Multithreading for Lenient Parallel Languages. In Proc. of FPCA '91 Conference on Functional Programming Languages and Computer Architecture, Springer Verlag, aug 1991.

Kevin B. Theobald. EARTH - An Efficient Architecture for Running THreads. In Ph.D Thesis. School of Computer Science. McGill University Montréal. Québec. March 1999.

John Thomley, K. Mani Chandy, and Hiroshi Ishii. A System for Structured High-Performance Multithreaded Programming in Windows NT. In Proc. of the 2nd USENIX Windows NT Symposium, pp. 67-76, Seattle, Washington. August 1998.

John Plevyak Vijay Karamcheti and Andrew A. Chien. Runtime Mechanisms for Efficient Dynamic Multithreading. Journal of Parallel and Distributed Computing, 37:21-40, August 1996.

Boris Weissman. Active Threads: an Extensible and Ponable LightWeight Thread System. Technical report. September 1997.
Publicado
29/09/1999
KAKULAVARAPU, Prasad; AMARAL, José Nelson. A Survey of Load Balancers in Modern Multi-Threading Systems. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 11. , 1999, Natal. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 1999 . p. 10-16. DOI: https://doi.org/10.5753/sbac-pad.1999.19766.