Easily Rendering Token-Ring Algorithms of Distributed and Parallel Applications Fault Tolerant

  • Luciana Arantes LIP6 / Université de Paris 6 / INRIA
  • Julien Sopena LIP6 / Université de Paris 6 / INRIA

Abstract


We propose in this paper a new algorithm that, when called by existing token ring-based algorithms of parallel and distributed applications, easily renders the token tolerant to losses in presence of node crashes. At most k consecutive node crashes are tolerated in the ring. Our algorithm scales very well since a node monitors the liveness of at most k other nodes and neither a global election algorithm nor broadcast primitives are used to regenerate a new token. It is thus very effective in terms of latency cost. Finally, a study of the probability of having at most k consecutive node crashes in the presence of f failures and a discussion of how to extend our algorithm to other logical topologies are also presented.
Keywords: Silicon, Monitoring, Computer crashes, Nominations and elections, Fault tolerance, Fault tolerant systems, distributed algorithm, fault tolerance, token-based, scalability
Published
2013-10-23
ARANTES, Luciana; SOPENA, Julien. Easily Rendering Token-Ring Algorithms of Distributed and Parallel Applications Fault Tolerant. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 25. , 2013, Porto de Galinhas/PE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2013 . p. 206-213.