Simultaneous Speculation Scheduling
Abstract
Simultaneous Speculation Scheduling (S3) is a combined compiler and architecture technique for controlling multiple-path execution. It can be applied to dual-path branch speculation at unpredictable branches and to multiple-path speculative execution of loop iterations. Loop-carried dependences are handled by data dependence prediction. The architectural requirements are a minimal form of multithreaded processor architecture and three new instructions (fork, sync, wait). Simulation results show performance gains of up to 40% over purely static scheduling techniques when the S3 technique is applied to branches in kernel sections of SPECint95 benchmark programs.