Data Reconciliation in DHT Networks
Resumo
Optimistic replication can provide high data availability for collaborative applications in large scale distributed systems (grid, P2P, and mobile systems). However, if data reconciliation is performed by a single node, data availability remains an important issue since the reconciler node can fail. Thus, reconciliation should also be distributed and reconciliation data should be replicated. We have previously proposed the DSR-cluster algorithm, a distributed version of the IceCube semantic reconciliation engine designed for cluster networks. However DSR-cluster is not suitable for P2P networks, which are usually built on top of the Internet. In this case, network costs must be considered. The main contribution of this paper is the DSR-P2P algorithm, a distributed reconciliation algorithm designed for P2P networks. We first propose a P2P-DHT cost model for computing communication costs in a DHT overlay network. Second, taking into account this model, we propose a cost model for computing the cost of each reconciliation step. Third, we propose an algorithm that dynamically selects the best nodes for each reconciliation step. Our algorithm yields high data availability with acceptable performance and limited overhead.
Referências
BRITE, http://www.cs.bu.edu/brite/.
Clarke, I., Miller, S., Hong, T.W., Sandberg, O., and Wiley, B. Protecting Free Expression Online with Freenet. IEEE Internet Computing, 6(1), 2002.
Howell, F. and McNab, R. SimJava: a discrete event simulation package for Java with applications in computer systems modeling. In Web-based Modeling and Simulation, 1998.
Huebsch, R., Hellerstein, J. M., Lanham, N., Stoica, I., Loo, B. T., and Shenker, S. Querying the Internet with PIER. Proc. of VLDB Conference, 2003.
Kermarrec, A-M, Rowstron, A, Shapiro, M and Druschel P. The IceCube approach to the reconciliation of diverging replicas. Proc. of ACM PODC, 2001.
Knezevic, P., Wombacher, A., and Risse, T. Enabling High Data Availability in a DHT. Proc. of GLOBE, 2005.
Kubiatowicz, J., Bindel, D., Chen, Y., Czerwinski, S., Eaton, P., Geels, D., Gummadi, R., Rhea, S., Weatherspoon, H., Weimer, W., Wells, C., and Zhao, B. Ocean-Store: An Architecture for Global-Scale Persistent Storage. Proc. of ASPLOS, 2000.
Martins, V., Pacitti, E., and Valduriez, P. A Dynamic Distributed Algorithm for Semantic Reconciliation. Distributed Data & Structures 7 (WDAS), 2006.
Martins, V., Pacitti, E., and Valduriez, P. Distributed Semantic Reconciliation of Replicated Data. Proc of CDUR, 2005.
Pacitti, E. and Dedieu, O. Algorithms for optimistic replication on the Web. Journal of the Brazilian Computing Society, 8(2), 2002.
Pacitti, E. and Simon, E. Update propagation strategies to improve freshness in lazy master replicated databases. The VLDB Journal, 8(3-4), 2000.
Preguiça, N, Shapiro, M and Matheson, C. Semantics-based reconciliation for collaborative and mobile environments. Proc. of IFCIS CoopIS, 2003.
Ratnasamy, S., Francis, P., Handley, M., Karp, R., and Shenker, S. A scalable content-addressable network. Proc. of ACM SIGCOMM, 2001.
Saito, Y. and Shapiro, M. Optimistic Replication. ACM Computing Surveys, 37(1), 2005.
Stoica, I, Morris, R, Karger, D. R, Kaashoek, M. F and Balakrishnan, H. Chord: A scalable peer-to-peer lookup service for internet applications. Proc. of ACM SIGCOMM, 2001.
Whittaker, S., Issacs, e., and O’Day, V. Widening the Net: Workshop report on the theory and practice of physical and network communities. ACM SIGCHI Bulletin, 29(3), 1997.