Investigating the Relative Performance of Static and Dynamic Instruction Scheduling

  • Daniel Tate University of Hertfordshire
  • Gordon Steven University of Hertfordshire
  • Paul Findlay University of Hertfordshire

Abstract


Research into instruction-level parallelism (ILP) falls into two distinct groups: those that strongly favour static instruction scheduling and those that favour dynamic instruction scheduling. This paper introduces powerful static and dynamic scheduling models and combines them within the framework of a single simulation environment. Both individual models achieve respectable speedups; dynamic scheduling significantly outperforms static scheduling when an idealised processor model with perfect branch prediction is used. However, when a realistic branch predictor is substituted, the roles are reversed, and static scheduling achieves the higher performance. Similarly, static scheduling performs better in the absence of branch prediction or when processor resources are restricted. Finally, we combine static scheduling with out-of-order instruction issue. Disappointingly, when an ideal out-of-order processor is used, scheduled code fails to match the performance of unscheduled code. Furthermore, with realistic branch prediction, out-of-order issue fails to improve the performance of scheduled code.

Keywords: High Performance Processors, Instruction Scheduling, Dynamic Scheduling, Multiple Instruction Issue

Published
29/09/1999
TATE, Daniel; STEVEN, Gordon; FINDLAY, Paul. Investigating the Relative Performance of Static and Dynamic Instruction Scheduling. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 11., 1999, Natal. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 1999. p. 183-190. DOI: https://doi.org/10.5753/sbac-pad.1999.19788.