Performance Evaluation of Checkpoint and Rollback-Recovery Algorithms for Distributed Systems
Abstract
In distributed systems, backward recovery is implemented under two main paradigms: the synchronous approach and the asynchronous approach. In this paper we compare one representative algorithm from each group and present some theoretical results. The synchronous algorithm of Koo & Toueg and the asynchronous algorithm of Juang & Venkatesan were chosen for this purpose. Our goal is to demonstrate that the relative advantages and disadvantages of the two algorithms depend mainly on the characteristics of the application.
References
Jalote, P. Fault Tolerance in Distributed Systems. New Jersey: Prentice-Hall, 1994.
Juang, T.; Venkatesan, S. Crash Recovery with Little Overhead. Proceedings of the Int'l. Conf. on Distributed Computing Systems, May 1991, pp. 454-461.
Koo, R.; Toueg, S. Checkpointing and Rollback-Recovery for Distributed Systems. IEEE Trans. on Software Engineering, v. SE-13, n. 1, pp. 23-31, Jan. 1987.
Singhal, M.; Shivaratri, N. Advanced Concepts in Operating Systems. New York: McGraw-Hill, 1994.