Benchmark TPC-C Aplicado em Replicação Máquina de Estados

  • Kayel L. Serafim PUCRS
  • Eduardo Alchieri UnB
  • Fernando Dotti PUCRS

Abstract


State Machine Replication (SMR) is an important approach to providing highly availabile services. Due to its deterministic model, scaling throughput in SMR is a challenge. However, there is a lack of common workloads to evaluate SMR, which are representative to significant application classes. We identify important and common aspects in the context of online transaction processing and discuss the use of such workloads to evaluate SMR. Due to its acceptance, we propose the use of benchmark C from the Transaction Processing Performance Committee, called TPC-C, for SMR evaluation. We discuss its architecture for the context of SMR and implemented it on a replication platform. Results for the sequential SMR model are reported. Moreover, a parallel SMR approach is discussed for TPC-C, implemented and results reported.

References

Alchieri, E., Dotti, F., and Pedone, F. (2018). Early scheduling in parallel state machine replica. In ACM SoCC.

Attiya, H. and Welch, J. (2004). Distributed Computing: Fundamentals, Simulations, and Advanced Topics. Wiley-Interscience.

Batista, E., Alchieri, E., Dotti, F., and Pedone, F. (2022). Early scheduling on steroids: Boosting parallel state machine replication. Journal of Parallel and Distributed Computing, 163:269–282.

Bessani, A., Sousa, J. a., and Alchieri, E. E. P. (2014). State machine replication for the masses with bft-smart. In Proceedings of the 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN ’14, pages 355–362, Washington, DC, USA. IEEE Computer Society.

Bezerra, C. E., Pedone, F., and Renesse, R. V. (2014). Scalable state-machine replication. In DSN, pages 331–342.

Burgos, A., Alchieri, E., Dotti, F., and Pedone, F. (2022). Exploiting concurrency in sharded parallel state machine replication. IEEE Transactions on Parallel and Distributed Systems, 33(9):2133–2147.

Burrows, M. (2006). The chubby lock service for loosely-coupled distributed systems. In OSDI.

Castro, M. and Liskov, B. (2002). Practical byzantine fault-tolerance and proactive recovery. ACM Transactions on Computer Systems, 20(4):398–461.

Chang, F., Dean, J., Ghemawat, S., Hsieh, W. C., Wallach, D. A., Burrows, M., Chandra, T., Fikes, A., and Gruber, R. E. (2008). Bigtable: A distributed storage system for structured data. ACM Trans. Comput. Syst., 26(2):1–26.

Cowling, J. and Liskov, B. (2012). Granola: Low-Overhead distributed transaction coordination. In USENIX Annual Technical Conference. USENIX Association.

Escobar, I., Alchieri, E., Dotti, F., and Pedone, F. (2019). Boosting concurrency in parallel state machine replication. In Middleware.

Hadzilacos, V. and Toueg, S. (1994). A modular approach to the specification and implementation of fault-tolerant broadcasts. Technical report, Department of Computer Science, Cornell.

Herlihy, M. P. and Wing, J. M. (1990). Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems (TOPLAS), 12(3):463–492.

Hunt, P., Konar, M., Junqueira, F. P., and Reed, B. (2010). Zookeeper: wait-free coordination for internet-scale systems. In ATC, volume 8.

J. C. Corbett, J. D. and et al, M. E. (2012). Spanner: Google’s globally distributed database. In OSDI.

Kapritsos, M., Wang, Y., Quema, V., Clement, A., Alvisi, L., and Dahlin, M. (2012). All about eve: execute-verify replication for multi-core servers. In OSDI.

Kotla, R. and Dahlin, M. (2004). High throughput byzantine fault tolerance. In DSN.

Lakshman, A. and Malik, P. (2010). Cassandra: a decentralized structured storage system. ACM SIGOPS Operating Systems Review, 44(2):35–40.

Lamport, L. (1978). Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 21(7):558–565.

Le, L. H. (2020). Scaling State Machine Replication. PhD thesis, Università della Svizzera italiana.

Li, Z., Van Roy, P., and Romano, P. (2017). Enhancing throughput of partially replicated state machines via multi-partition operation scheduling. In 2017 IEEE 16th International Symposium on Network Computing and Applications (NCA), pages 1–10.

Marandi, P. J., Bezerra, C. E. B., and Pedone, F. (2014). Rethinking state-machine replication for parallelism. In ICDCS.

Marandi, P. J. and Pedone, F. (2014). Optimistic parallel state-machine replication. In SRDS.

Mendizabal, O. M., Moura, R. T. S., Dotti, F. L., and Pedone, F. (2017). Efficient and deterministic scheduling for parallel state machine replication. In IPDPS.

Pedone, F., Alchieri, E., Dotti, F., Marandi, P., and Mendizabal, O. (2018). Boosting state machine replication with parallel execution. In Anais do VIII Latin-American Symposium on Dependable Computing, pages 77–86, Porto Alegre, RS, Brasil. SBC.

Schiper, N., Sutra, P., and Pedone, F. (2010). P-store: Genuine partial replication in wide area networks. In Symposium on Reliable Distributed Systems (SRDS).

Schneider, F. B. (1990). Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Computing Surveys, 22(4):299–319.

Shvachko, K., Kuang, H., Radia, S., and Chansler, R. (2010). The hadoop distributed file system. In MSST.

Thomson, A., Diamond, T., Weng, S.-C., Ren, K., Shao, P., and Abadi, D. J. (2012). Calvin: Fast distributed transactions for partitioned database systems. In Proceedings of ACM SIGMOD International Conference on Management of Data, SIGMOD ’12.

Verma, A., Pedrosa, L., Korupolu, M., Oppenheimer, D., Tune, E., and Wilkes, J. (2015). Large-scale cluster management at google with borg. In EuroSys.
Published
2023-05-26
SERAFIM, Kayel L.; ALCHIERI, Eduardo; DOTTI, Fernando. Benchmark TPC-C Aplicado em Replicação Máquina de Estados. In: FAULT TOLERANCE WORKSHOP (WTF), 24. , 2023, Brasília/DF. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023 . p. 121-134. ISSN 2595-2684. DOI: https://doi.org/10.5753/wtf.2023.803.