DifATo - Difusão Atômica Tolerante a Faltas Bizantinas Baseada em Tecnologia de Virtualização
Abstract
This paper presents a byzantine fault-tolerant protocol for atomic multicast whose algorithm manages to implement a reliable consensus service with only 2f + 1 servers to tolerate f faulty ones. For the creation of the algorithm we used common technologies such as virtualization and data sharing abstractions. The system model adopted is hybrid, meaning that the assumptions of synchrony and occurrence of faults consider each component separately. Moreover, in our model, we use two networks to provide the service, a payload network, where messages are exchanged between clients and servers, and a tamperproof network, where the messages are ordered.
References
Castro, M. and Liskov, B. (2002). Practical byzantine fault tolerance and proactive recovery. ACM Transactions on Computer Systems (TOCS), 20(4):398–461.
Chandra, T. D. and Toueg, S. (1996). Unreliable failure detectors for reliable distributed systems. Journal of the ACM, 43(2):225–267.
Correia, M., Neves, N. F., and Verissimo, P. (2004). How to tolerate half less one byzantine nodes in practical distributed systems. In Reliable Distributed Systems, 2004. Proceedings of the 23rd IEEE International Symposium on, pages 174–183. IEEE.
Correia, M., Neves, N. F., and Verı́ssimo, P. (2006). From consensus to atomic broadcast: Time-free byzantine-resistant protocols without signatures. The Computer Journal, 49(1):82–96.
Correia, M., Verissimo, P., and Neves, N. (2002). The design of a cots real-time distributed security kernel. Dependable Computing EDCC-4, pages 634–638.
Défago, X., Schiper, A., and Urbán, P. (2004). Total order broadcast and multicast algorithms: Taxonomy and survey. ACM Computing Surveys (CSUR), 36(4):372–421.
Ekwall, R., Schiper, A., and Urbán, P. (2004). Token-based atomic broadcast using unreliable failure detectors. In Reliable Distributed Systems, 2004. Proceedings of the 23rd IEEE International Symposium on, pages 52–65. IEEE.
Favarim, F., Fraga, J. S., Lung, L. C., Correia, M., and Santos, J. F. (2007). Exploiting tuple spaces to provide fault-tolerant scheduling on computational grids. In Object and Component-Oriented Real-Time Distributed Computing, 2007. ISORC’07. 10th IEEE International Symposium on, pages 403–411. IEEE.
Guerraoui, R. and Rodrigues, L. (2006). Introduction to reliable distributed programming. Springer-Verlag New York Inc.
Guerraoui, R. and Schiper, A. (2001). The generic consensus service. Software Engineering, IEEE Transactions on, 27(1):29–41.
Jain, R. (1991). The art of computer systems performance analysis, volume 182. John Wiley & Sons Chichester.
Kemme, B., Pedone, F., Alonso, G., Schiper, A., and Wiesmann, M. (2003). Using optimistic atomic broadcast in transaction processing systems. Knowledge and Data Engineering, IEEE Transactions on, 15(4):1018–1032.
Lamport, L., Shostak, R., and Pease, M. (1982). The byzantine generals problem. ACM Transactions on Programming Languages and Systems (TOPLAS), 4(3):382–401.
Menezes, A. J., Van Oorschot, P. C., and Vanstone, S. A. (1996). Handbook of applied cryptography. CRC.
Obelheiro, R., Bessani, A., and Lung, L. (2005). Analisando a viabilidade da implementação prática de sistemas tolerantes a intrusões. In Anais do V Simpósio Brasileiro em Segurança da Informação e de Sistemas Computacionais - SBSeg’05, pages 99–112. SBC.
Pieri, G., da Silva Fraga, J., and Lung, L. C. (2010). Consensus service to solve agreement problems. In Parallel and Distributed Systems (ICPADS), 2010 IEEE 16th International Conference on, pages 267–274. IEEE.
Reiter, M. K. (1994). Secure agreement protocols: Reliable and atomic group multicast in rampart. In Proceedings of the 2nd ACM Conference on Computer and Communications Security, pages 68–80. ACM.
Rodrigues, L., Verı́ssimo, P., and Casimiro, A. (1993). Using atomic broadcast to implement a posteriori agreement for clock synchronization. In Proceedings of the 12th Symposium on Reliable Distributed Systems - SRDS’93, pages 115–124. IEEE.
Schneider, F. B. (1990). Implementing fault-tolerant services using the state machine approach: a tutorial. ACM Comput. Surv., 22(4):299–319.
Verı́ssimo, P. E. (2006). Travelling through wormholes: a new look at distributed systems models. ACM SIGACT News, 37(1):66–81.
Veronese, G. S., Correia, M., Bessani, A. N., and Lung, L. C. (2009). Spin one’s wheels? byzantine fault tolerance with a spinning primary. In Reliable Distributed Systems, 2009. SRDS’09. 28th IEEE International Symposium on, pages 135–144. IEEE.
Wangham, M. S., Lung, L. C., Westphall, C. M., and da Silva Fraga, J. (2001). Integrating ssl to the jacoweb security framework: Project and implementation. In Integrated Network Management’01, pages 779–792.
Yin, J., Martin, J., Venkataramani, A., Alvisi, L., and Dahlin, M. (2003). Separating agreement from execution for byzantine fault tolerant services. ACM SIGOPS Operating Systems Review, 37(5):253–267.
