Leader Election with Quality of Service for the Crash-Recovery Model
Abstract
Distributed systems use unreliable failure detectors to encapsulate the abstraction of time and to determine which processes have currently failed. Very often, these detectors are the basis for a leader election service. Many works analyze the quality of service (QoS) of failure detectors, but only a few have analyzed the QoS of a leader election algorithm. In this work, we present the NFD-L leader election algorithm, designed to work on crash-recovery distributed systems and to follow the QoS specification defined by [Chen et al. 2002]. We used NFD-L to elect Paxos coordinators for a replication framework and compared the observed QoS of NFD-L with the behavior of the framework's native leader election algorithm, which is not designed to explicitly meet any QoS requirement.
References
Chen, W., Toueg, S., and Aguilera, M. (2002). On the quality of service of failure detectors. IEEE Transactions on Computers, 51(5):561–580.
Fischer, M. J., Lynch, N. A., and Paterson, M. S. (1985). Impossibility of distributed consensus with one faulty process. J. ACM, 32(2):374–382.
Garcia-Molina, H. (1982). Elections in a distributed computing system. IEEE Trans. Comput., 31(1):48–59.
Guerraoui, R. (2000). Indulgent algorithms (preliminary version). In Proceedings of the Nineteenth Annual ACM Symposium on Principles of Distributed Computing, PODC ’00, pages 289–297, New York, NY, USA. ACM.
Lamport, L. (1998). The part-time parliament. ACM Trans. Comput. Syst., 16(2):133–169.
Ma, T., Hillston, J., and Anderson, S. (2010). On the quality of service of crash-recovery failure detectors. IEEE Trans. Dependable Secur. Comput., 7(3):271–283.
Nunes, R. C. and Jansch-Porto, I. (2004). QoS of timeout-based self-tuned failure detectors: The effects of the communication delay predictor and the safety margin. In Proceedings of the 2004 International Conference on Dependable Systems and Networks, DSN ’04, pages 753–, Washington, DC, USA. IEEE Computer Society.
Reis, V. A. (2017). Eleição de líder com qualidade de serviço para o modelo falha-e-recuperação. Master’s thesis, Universidade Federal de São Carlos, Sorocaba, Brasil.
Reis, V. A. and Vieira, G. M. D. (2017). Quality of service of an asynchronous crash-recovery leader election algorithm. In SBRC ’17: Proc. of the 35th Brazilian Symposium on Computer Networks and Distributed Systems, pages 1089–1102, Belém, Brazil.
Schiper, N. and Toueg, S. (2008). A robust and lightweight stable leader election service for dynamic systems. In IEEE International Conference on Dependable Systems and Networks With FTCS and DCC, DSN 2008, pages 207–216. IEEE.
Sotoma, I. and Madeira, E. R. M. (2006). A Markov model for providing quality of service for failure detectors under message loss bursts. Technical Report IC-06-013, Institute of Computing, University of Campinas.
Vieira, G. M. D. and Buzato, L. E. (2008). Treplica: Ubiquitous replication. In SBRC ’08: Proc. of the 26th Brazilian Symposium on Computer Networks and Distributed Systems, Rio de Janeiro, Brazil.
