Consenso com Recuperação no Modelo Partitioned Synchronous
Resumo
O modelo síncrono particionado (partioned synchronous foi introduzido para tirar proveito de partições síncronas em sistemas distribuídos híbridos, uma vez que estas são implementáveis em muitos cenários reais. No presente artigo apresentamos pela primeira vez um algoritmo para consenso com recuperação de processos, e respectivas provas formais, adequado ao modelo síncrono particionado. O algoritmo proposto tem como principal vantagem a capacidade de tolerar n-k defeitos de processos, onde k é o número de partições síncronas e n o tonal de processos no sistema - sendo que podem existir processos que não fazem parte de partições síncronas. Em particular, a robustez do protocolo se aplica mesmo se a maioria dos processos não estiver em partições síncronas, o que representa uma vantagem em termos de robustez quando comparado com soluções para modelos convencionais.Referências
Aguilera, M. K., Chen, W., and Toueg, S. (1998). Failure detection and consensus in the crash-recovery model. In DISC ’98: Proceedings of the 12th International Symposium on Distributed Computing, pages 231–245.
Chandra, T. D. and Toueg, S. (1996). Unreliable failure detectors for reliable distributed systems. Journal of the ACM, 43(2):225–267.
Cristian, F. and Fetzer, C. (1999). The timed asynchronous distributed system model. IEEE Transactions on Parallel and Distributed Systems, 10(6):642–657.
Dwork, C., Lynch, N., and Stockmeyer, L. (1988). Consensus in the presence of partial synchrony. Journal of the ACM, 35(2):288–323.
Fisher, M. J., Lynch, N., and Paterson, M. S. (1985). Impossibility of distributed consensus with one faulty process. Journal of the ACM, 32(2):374–382.
Freiling, F. C., Lambertz, C., and Majster-Cederbaum, M. (2008). Easy consensus algorithms for the crash-recovery model. In DISC ’08: Proceedings of the 22nd international symposium on Distributed Computing, pages 507–508, Berlin, Heidelberg. Springer-Verlag.
Gorender, S. and Macêdo, R. J. A. (2002). Um modelo para tolerância a falhas em sistemas distribuídos com qos. In Anais do Simpósio Brasileiro de Redes de Computadores, SBRC 2002, pages 277–292.
Gorender, S., Macêdo, R. J. A., and Raynal, M. (2007). An adaptive programming model for fault-tolerant distributed computing. IEEE Transactions on Dependable and Secure Computing, 4(1):18–31.
Lamport, L. (1998). The part time parliament. ACM Trans. on Computer Systems, 16(2):133–169.
Lamport, L. (2001). Paxos made simple. Distributed Computing Column of ACM SIGACT News, 32(4):51–58.
Lynch, N. A. (1996). Distributed Algorithms. Morgan Kaufmann Publishers, Inc.
Macêdo, R. J. A. (2007). An integrated group communication infraestructure for hybrid real-time distributed systems. In 9th Workshop on Real-Time Systems, pages 81–88.
Macêdo, R. J. A. and Freitas, A. (2009). A generic group communication approach for hybrid distributed systems. (5523(4)):102–115.
Macêdo, R. J. A. and Gorender, S. (2008). Detectores perfeitos em sistemas distribuídos não síncronos. In IX Workshop de Teste e Tolerância a Falhas (WTF 2008), Rio de Janeiro, Brazil.
Macêdo, R. J. A. and Gorender, S. (2009). Perfect failure detection in the partitioned synchronous distributed system model. In Proceedings of the The Fourth International Conference on Availability, Reliability and Security (ARES 2009), IEEE CS Press. To appear in an extended version in Int. Journal of Critical Computer-Based Systems (IJCCBS).
Macêdo, R. J. A., Gorender, S., and Raynal, M. (2005). A qos-based adaptive model for fault-tolerant distributed computing (with an application to consensus). In Proceedings of IEEE/IFIP Int. Conference on Computer Systems and Networks (DNS05), pages 412–421.
Veríssimo, P. and Casimiro, A. (2002). The ti mely computing base model and architecture. IEEE Transactions on Computers, 51(8):916–930.
Chandra, T. D. and Toueg, S. (1996). Unreliable failure detectors for reliable distributed systems. Journal of the ACM, 43(2):225–267.
Cristian, F. and Fetzer, C. (1999). The timed asynchronous distributed system model. IEEE Transactions on Parallel and Distributed Systems, 10(6):642–657.
Dwork, C., Lynch, N., and Stockmeyer, L. (1988). Consensus in the presence of partial synchrony. Journal of the ACM, 35(2):288–323.
Fisher, M. J., Lynch, N., and Paterson, M. S. (1985). Impossibility of distributed consensus with one faulty process. Journal of the ACM, 32(2):374–382.
Freiling, F. C., Lambertz, C., and Majster-Cederbaum, M. (2008). Easy consensus algorithms for the crash-recovery model. In DISC ’08: Proceedings of the 22nd international symposium on Distributed Computing, pages 507–508, Berlin, Heidelberg. Springer-Verlag.
Gorender, S. and Macêdo, R. J. A. (2002). Um modelo para tolerância a falhas em sistemas distribuídos com qos. In Anais do Simpósio Brasileiro de Redes de Computadores, SBRC 2002, pages 277–292.
Gorender, S., Macêdo, R. J. A., and Raynal, M. (2007). An adaptive programming model for fault-tolerant distributed computing. IEEE Transactions on Dependable and Secure Computing, 4(1):18–31.
Lamport, L. (1998). The part time parliament. ACM Trans. on Computer Systems, 16(2):133–169.
Lamport, L. (2001). Paxos made simple. Distributed Computing Column of ACM SIGACT News, 32(4):51–58.
Lynch, N. A. (1996). Distributed Algorithms. Morgan Kaufmann Publishers, Inc.
Macêdo, R. J. A. (2007). An integrated group communication infraestructure for hybrid real-time distributed systems. In 9th Workshop on Real-Time Systems, pages 81–88.
Macêdo, R. J. A. and Freitas, A. (2009). A generic group communication approach for hybrid distributed systems. (5523(4)):102–115.
Macêdo, R. J. A. and Gorender, S. (2008). Detectores perfeitos em sistemas distribuídos não síncronos. In IX Workshop de Teste e Tolerância a Falhas (WTF 2008), Rio de Janeiro, Brazil.
Macêdo, R. J. A. and Gorender, S. (2009). Perfect failure detection in the partitioned synchronous distributed system model. In Proceedings of the The Fourth International Conference on Availability, Reliability and Security (ARES 2009), IEEE CS Press. To appear in an extended version in Int. Journal of Critical Computer-Based Systems (IJCCBS).
Macêdo, R. J. A., Gorender, S., and Raynal, M. (2005). A qos-based adaptive model for fault-tolerant distributed computing (with an application to consensus). In Proceedings of IEEE/IFIP Int. Conference on Computer Systems and Networks (DNS05), pages 412–421.
Veríssimo, P. and Casimiro, A. (2002). The ti mely computing base model and architecture. IEEE Transactions on Computers, 51(8):916–930.
Publicado
28/05/2010
Como Citar
GORENDER, Sérgio; MACÊDO, Raimundo.
Consenso com Recuperação no Modelo Partitioned Synchronous. In: WORKSHOP DE TESTES E TOLERÂNCIA A FALHAS (WTF), 11. , 2010, Gramado/RS.
Anais [...].
Porto Alegre: Sociedade Brasileira de Computação,
2010
.
p. 3-16.
ISSN 2595-2684.
DOI: https://doi.org/10.5753/wtf.2010.23092.