Eleição de Líder com o vCube para Sistemas Distribuídos Crash-Recovery no Modelo GST
Resumo
A eleição de líder é um dos problemas fundamentais de sistemas distribuídos, com um número enorme de aplicações. Neste artigo apresentamos um algoritmo hierárquico e autonômico para eleição de líder baseado na topologia virtual vCube. O vCube é redefinido com um detector de falhas para o modelo crash-recovery, considerando que os processos têm acesso a memória não volátil. O algoritmo elege como líder aquele processo correto entre os mais estáveis (que falharam/recuperaram menos vezes) que tem menor identificador. A correção do algoritmo é discutida, em termos das propriedades de precisão e acordo após um tempo. O algoritmo foi implementado com simulação e resultados comparando com a versão força-bruta demonstram sua escalabilidade.Referências
Aguilera, M. K., Chen, W., and Toueg, S. (2000). Failure detection and consensus in the crash-recovery model. Distributed computing, 13(2):99–125.
Biswas, A., Maurya, A. K., Tripathi, A. K., and Aknine, S. (2021). Frlle: a failure rate and load-based leader election algorithm for a bidirectional ring in distributed systems. The Journal of Supercomputing, 77:751–779.
Biswas, A. and Tripathi, A. K. (2021). Preselection based leader election in distributed systems. In International Symposium on Intelligent and Distributed Computing, pages 261–271. Springer.
Bona, L. C., Duarte Jr, E. P., Mello, S. L., and Fonseca, K. V. (2006). Hyperbone: Uma rede overlay baseada em hipercubo virtual sobre a internet. XXIV SBRC.
Cachin, C., Guerraoui, R., and Rodrigues, L. (2011). Introduction to reliable and secure distributed programming. Springer Science & Business Media.
Chandra, T. D. and Toueg, S. (1996). Unreliable failure detectors for reliable distributed systems. J. ACM, 43(2):225–267.
Duarte, E. P., Albini, L. C., Brawerman, A., and Guedes, A. L. (2009). A hierarquical distributed fault diagnosis algorithm based on clusters with detours. In The 6th IEEE Latin American Network Operations and Management Symposium, pages 1–6. IEEE.
Duarte, E. P., Bona, L. C. E., and Ruoso, V. K. (2014). Vcube: A provably scalable distributed diagnosis algorithm. In 2014 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, pages 17–22.
Duarte, E. P., Garrett, T., Bona, L. C., Carmo, R., and Züge, A. P. (2010). Finding stable cliques of planetlab nodes. In DSN 2010, pages 317–322. IEEE.
Duarte, E. P. and Nanya, T. (1998). A hierarchical adaptive distributed system-level diagnosis algorithm. IEEE Transactions on computers, 47(1):34–45.
Duarte Jr, E. P. and Cestari, J. M. A. (2000). O agente chinês para diagnóstico de redes de topologia arbitrária. In Anais do II WTF, pages 88–93. SBC.
Duarte Jr, E. P., Rodrigues, L. A., Camargo, E. T., and Turchetti, R. C. (2023). The missing piece: a distributed system-level diagnosis model for the implementation of unreliable failure detectors. Computing, 105(12):2821–2845.
Fernández-Campusano, C., Larrea, M., Cortiñas, R., and Raynal, M. (2017). A distributed leader election algorithm in crash-recovery and omissive systems. Information Processing Letters, 118:100–104.
Gómez-Calzado, C., Larrea, M., Soraluze, I., Lafuente, A., and Cortiñas, R. (2013). An evaluation of efficient leader election algorithms for crash-recovery systems. In 2013 21st Euromicro, pages 180–188.
Jeanneau, D., Rodrigues, L. A., Arantes, L., and Jr., E. P. D. (2017). An autonomic hierarchical reliable broadcast protocol for asynchronous distributed systems with failure detection. J. Braz. Comput. Soc., 23(1):15:1–15:14.
Kumar, N., Pathan, A.-S. K., Duarte, E. P., and Shaikh, R. A. (2015). Critical applications in vehicular ad hoc/sensor networks. Telecommunication Systems, 58:275–277.
Rodrigues, L. A. (2006). Extensão do suporte para simulação de defeitos em algoritmos distribuídos utilizando o neko. Master’s thesis, UFRGS.
Rodrigues, L. A., Arantes, L., and Duarte, E. P. (2016). An autonomic majority quorum system. In 2016 IEEE 30th AINA, pages 524–531. IEEE.
Rodrigues, L. A., Duarte, E. P., and Arantes, L. (2018a). A distributed k-mutual exclusion algorithm based on autonomic spanning trees. JPDC, 115:41–55.
Rodrigues, L. A., Duarte, E. P., de Araujo, J. P., Arantes, L., and Sens, P. (2018b). Bundling messages to reduce the cost of tree-based broadcast algorithms. In 2018 8th LADC, pages 115–124. IEEE.
Ruchel, L. V., de Camargo, E. T., Rodrigues, L. A., Turchetti, R. C., Arantes, L., and Duarte Jr, E. P. (2024). Scalable atomic broadcast: A leaderless hierarchical algorithm. JPDC, 184:104789.
Santoro, N. (2006). Design and analysis of distributed algorithms. John Wiley & Sons.
Stein, G., Rodrigues, L. A., Duarte Jr, E. P., and Arantes, L. (2023). Diamond-p-vcube: An eventually perfect hierarchical failure detector for asynchronous distributed systems. In Proc. 12th LADC, pages 40–49.
Urban, P., Defago, X., and Schiper, A. (2000). Contention-aware metrics for distributed algorithms: comparison of atomic broadcast algorithms. In 9th Int’l Conf.on Computer Communications and Networks (ICCCN), pages 582–589.
Urban, P., Defago, X., and Schiper, A. (2001). Neko: a single environment to simulate and prototype distributed algorithms. In 15th Int’l Conf. Info. Networking, pages 503–511.
Wang, J. and Gupta, I. (2023). Churn-tolerant leader election protocols. In 43rd IEEE ICDCS, pages 96–107. IEEE.
Biswas, A., Maurya, A. K., Tripathi, A. K., and Aknine, S. (2021). Frlle: a failure rate and load-based leader election algorithm for a bidirectional ring in distributed systems. The Journal of Supercomputing, 77:751–779.
Biswas, A. and Tripathi, A. K. (2021). Preselection based leader election in distributed systems. In International Symposium on Intelligent and Distributed Computing, pages 261–271. Springer.
Bona, L. C., Duarte Jr, E. P., Mello, S. L., and Fonseca, K. V. (2006). Hyperbone: Uma rede overlay baseada em hipercubo virtual sobre a internet. XXIV SBRC.
Cachin, C., Guerraoui, R., and Rodrigues, L. (2011). Introduction to reliable and secure distributed programming. Springer Science & Business Media.
Chandra, T. D. and Toueg, S. (1996). Unreliable failure detectors for reliable distributed systems. J. ACM, 43(2):225–267.
Duarte, E. P., Albini, L. C., Brawerman, A., and Guedes, A. L. (2009). A hierarquical distributed fault diagnosis algorithm based on clusters with detours. In The 6th IEEE Latin American Network Operations and Management Symposium, pages 1–6. IEEE.
Duarte, E. P., Bona, L. C. E., and Ruoso, V. K. (2014). Vcube: A provably scalable distributed diagnosis algorithm. In 2014 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, pages 17–22.
Duarte, E. P., Garrett, T., Bona, L. C., Carmo, R., and Züge, A. P. (2010). Finding stable cliques of planetlab nodes. In DSN 2010, pages 317–322. IEEE.
Duarte, E. P. and Nanya, T. (1998). A hierarchical adaptive distributed system-level diagnosis algorithm. IEEE Transactions on computers, 47(1):34–45.
Duarte Jr, E. P. and Cestari, J. M. A. (2000). O agente chinês para diagnóstico de redes de topologia arbitrária. In Anais do II WTF, pages 88–93. SBC.
Duarte Jr, E. P., Rodrigues, L. A., Camargo, E. T., and Turchetti, R. C. (2023). The missing piece: a distributed system-level diagnosis model for the implementation of unreliable failure detectors. Computing, 105(12):2821–2845.
Fernández-Campusano, C., Larrea, M., Cortiñas, R., and Raynal, M. (2017). A distributed leader election algorithm in crash-recovery and omissive systems. Information Processing Letters, 118:100–104.
Gómez-Calzado, C., Larrea, M., Soraluze, I., Lafuente, A., and Cortiñas, R. (2013). An evaluation of efficient leader election algorithms for crash-recovery systems. In 2013 21st Euromicro, pages 180–188.
Jeanneau, D., Rodrigues, L. A., Arantes, L., and Jr., E. P. D. (2017). An autonomic hierarchical reliable broadcast protocol for asynchronous distributed systems with failure detection. J. Braz. Comput. Soc., 23(1):15:1–15:14.
Kumar, N., Pathan, A.-S. K., Duarte, E. P., and Shaikh, R. A. (2015). Critical applications in vehicular ad hoc/sensor networks. Telecommunication Systems, 58:275–277.
Rodrigues, L. A. (2006). Extensão do suporte para simulação de defeitos em algoritmos distribuídos utilizando o neko. Master’s thesis, UFRGS.
Rodrigues, L. A., Arantes, L., and Duarte, E. P. (2016). An autonomic majority quorum system. In 2016 IEEE 30th AINA, pages 524–531. IEEE.
Rodrigues, L. A., Duarte, E. P., and Arantes, L. (2018a). A distributed k-mutual exclusion algorithm based on autonomic spanning trees. JPDC, 115:41–55.
Rodrigues, L. A., Duarte, E. P., de Araujo, J. P., Arantes, L., and Sens, P. (2018b). Bundling messages to reduce the cost of tree-based broadcast algorithms. In 2018 8th LADC, pages 115–124. IEEE.
Ruchel, L. V., de Camargo, E. T., Rodrigues, L. A., Turchetti, R. C., Arantes, L., and Duarte Jr, E. P. (2024). Scalable atomic broadcast: A leaderless hierarchical algorithm. JPDC, 184:104789.
Santoro, N. (2006). Design and analysis of distributed algorithms. John Wiley & Sons.
Stein, G., Rodrigues, L. A., Duarte Jr, E. P., and Arantes, L. (2023). Diamond-p-vcube: An eventually perfect hierarchical failure detector for asynchronous distributed systems. In Proc. 12th LADC, pages 40–49.
Urban, P., Defago, X., and Schiper, A. (2000). Contention-aware metrics for distributed algorithms: comparison of atomic broadcast algorithms. In 9th Int’l Conf.on Computer Communications and Networks (ICCCN), pages 582–589.
Urban, P., Defago, X., and Schiper, A. (2001). Neko: a single environment to simulate and prototype distributed algorithms. In 15th Int’l Conf. Info. Networking, pages 503–511.
Wang, J. and Gupta, I. (2023). Churn-tolerant leader election protocols. In 43rd IEEE ICDCS, pages 96–107. IEEE.
Publicado
24/05/2024
Como Citar
RODRIGUES, Luiz A.; FREITAS, Allan; DUARTE JR., Elias P..
Eleição de Líder com o vCube para Sistemas Distribuídos Crash-Recovery no Modelo GST. In: WORKSHOP DE TESTES E TOLERÂNCIA A FALHAS (WTF), 25. , 2024, Niterói/RJ.
Anais [...].
Porto Alegre: Sociedade Brasileira de Computação,
2024
.
p. 71-84.
ISSN 2595-2684.
DOI: https://doi.org/10.5753/wtf.2024.3294.