Designing a Configurable Group Service with Agreement Components
Abstract
In recent years, many group toolkits that support the construction of reliable applications have been designed. Although their designers' goals were similar, these toolkits differ in (i) the way the problems are tackled and (ii) the way the protocols are structured and set up in real systems. This paper presents the design principles underlying a group system. The system follows two innovative approaches that contribute to its flexible and configurable character. From an algorithmic point of view, the group primitives are implemented as instances of a generic consensus service. This choice has several advantages: (a) computation in the group is guaranteed as soon as a quorum of entities can communicate, (b) the group membership service is decoupled from the communication service, and (c) malfunctions are better controlled during periods of strong network instability. From an architectural point of view, the elementary group protocols are regarded as autonomous agreement components, which can be combined freely to implement other, richer reliable services. This strategy differs from the classical ones, in which the protocols are structured according to a fixed hierarchy of classes following a stack-based pattern of interaction.
References
K. Birman, Reliable Distributed Computing with the ISIS Toolkit, ch. Virtual Synchrony. Los Alamitos, CA: IEEE Computer Society Press, 1994.
R. van Renesse, K. Birman, and S. Maffeis, “Horus: a flexible group communication system,” Communications of the ACM, vol. 39, pp. 76–83, Apr. 1996.
M. Hayden, The Ensemble System. PhD thesis, Cornell University, 1998.
K. P. Birman, R. van Renesse, and W. Vogels, “Spinglass: Secure and scalable communications tools for mission-critical computing,” in International Survivability Conference and Exposition. DARPA DISCEX-2001, (Anaheim, California), June 2001.
E. Anceaume, B. Charron-Bost, P. Minet, and S. Toueg, “On the formal specification of group membership services,” Tech. Rep. TR95-1534, Dept. of Computer Science, Cornell University, Aug. 1995.
G. V. Chockler, I. Keidar, and R. Vitenberg, “Group communication specifications: a comprehensive study,” ACM Computing Surveys, vol. 33, pp. 427–469, Dec. 2001.
M. Fischer, N. Lynch, and M. Paterson, “Impossibility of distributed consensus with one faulty process,” Journal of the ACM, vol. 32, pp. 374–382, Apr. 1985.
T. D. Chandra, V. Hadzilacos, S. Toueg, and B. Charron-Bost, “On the impossibility of group membership,” in Proceedings of the 15th Annual ACM Symposium on Principles of Distributed Computing (PODC’96), (New York, USA), pp. 322–330, ACM, May 1996.
T. Chandra and S. Toueg, “Unreliable failure detectors for reliable distributed systems,” Journal of the ACM, vol. 43, pp. 225–267, Mar. 1996.
R. Guerraoui and A. Schiper, “The generic consensus service,” IEEE Transactions on Software Engineering, vol. 27, pp. 29–41, Jan. 2001.
B. Garbinato, Protocol Objects & Patterns for Structuring Reliable Distributed Systems. PhD thesis, École Polytechnique Fédérale de Lausanne, Switzerland, 1998.
F. G. P. Greve, Réponses efficaces au besoin d’accord dans un groupe. PhD thesis, IRISA - Université de Rennes I, France, Nov. 2002.
F. Greve, M. Hurfin, J.-P. Le Narzul, M. Xiaojung, and F. Tronel, “A library of agreement components for reliable distributed programming,” in Workshop on Communication Abstractions for Distributed Systems, with ACM ECOOP - 17th European Conference on Object-Oriented Programming, (Darmstadt, Germany), 2003.
F. Greve, M. Hurfin, and J.-P. Le Narzul, “Adam: une bibliothèque de composants d’accord pour la programmation d’applications fiables,” in Actes des Journées Composants, (Lille, France), Mar. 2004.
F. Brasileiro, F. Greve, M. Hurfin, J.-P. Le Narzul, and F. Tronel, “Eva: an event based framework for developing specialised communication protocols,” in NCA 2001: IEEE International Symposium on Network Computing and Applications, pp. 108–119, Feb. 2002.
F. Cristian, “Reaching agreement on processor group membership in synchronous distributed systems,” Distributed Computing, vol. 4, pp. 175–187, Apr. 1991.
V. Hadzilacos and S. Toueg, Distributed Systems, ch. Fault Tolerant Broadcasts and Related Problems, pp. 97–145. Addison-Wesley, 1993.
K. M. Chandy and J. Misra, “How processes learn,” Distributed Computing, vol. 1, pp. 40–52, 1986.
S. Maffeis, “Electra—making distributed programs object-oriented,” in Proc. of the Usenix Symp. on Experiences with Distributed and Multiprocessor Systems, (San Diego, CA (USA)), pp. 143–156, 1993.
Y. Ren, AQUA: A Framework for Providing Adaptive Fault Tolerance to Distributed Applications. PhD thesis, University of Illinois, Urbana, 2001.
L. Moser, P. Melliar-Smith, D. Agarwal, R. Budhia, and C. Lingley-Papadopoulos, “Totem: a fault-tolerant multicast group communication system,” Communications of the ACM, vol. 39, pp. 54–63, Apr. 1996.
D. Dolev and D. Malki, “The Transis approach to high availability cluster communication,” Communications of the ACM, vol. 39, pp. 64–70, Apr. 1996.
C. Malloth, Conception and Implementation of a Toolkit for Building Fault-Tolerant Distributed Applications in Large Scale Networks. PhD thesis, École Polytechnique Fédérale de Lausanne, 1996.
P. Felber, The CORBA Object Group Service: A Service Approach to Object Groups in CORBA. PhD thesis, École Polytechnique Fédérale de Lausanne, Switzerland, 1998.
S. Mishra, L. Peterson, and R. Schlichting, “Consul: a communication substrate for fault-tolerant distributed programs,” Distributed Systems Engineering Journal, vol. 1, no. 2, pp. 87–103, 1993.
N. T. Bhatti, M. A. Hiltunen, R. D. Schlichting, and W. Chiu, “Coyote: a system for constructing fine-grain configurable communication services,” ACM Transactions on Computer Systems, vol. 16, Nov. 1998.
M. A. Hiltunen and R. D. Schlichting, “The Cactus approach to building configurable middleware services,” in Proc. of the Workshop on Dependable System Middleware and Group Communication (DSMGC 2000), Oct. 2000.
M. Hurfin, R. Macêdo, M. Raynal, and F. Tronel, “A generic framework to solve agreement problems,” in Proc. of the 19th IEEE Symposium on Reliable Distributed Systems (SRDS’99), (Lausanne, Switzerland), pp. 56–65, Oct. 1999.
X. Liu, C. Kreitz, R. van Renesse, J. Hickey, M. Hayden, K. Birman, and R. Constable, “Building reliable, high-performance communication systems from components,” in Proc. of the 17th ACM Symposium on Operating Systems Principles (SOSP’99), (Charleston, USA), pp. 80–92, Dec. 1999.
F. Greve, M. Hurfin, M. Raynal, and F. Tronel, “Primary component asynchronous group membership as an instance of a generic agreement framework,” in ISADS’2001: 5th International Symposium on Autonomous Decentralized Systems, pp. 93–100, Mar. 2001.
Published
10/05/2004
How to Cite
GREVE, Fabíola Gonçalves Pereira; LE NARZUL, Jean-Pierre. Designing a Configurable Group Service with Agreement Components. In: WORKSHOP DE TESTES E TOLERÂNCIA A FALHAS (WTF), 5., 2004, Gramado/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2004. p. 121-132. ISSN 2595-2684. DOI: https://doi.org/10.5753/wtf.2004.23385.