Replicação Máquina de Estados Paralelas com Escalonamento Híbrido

Aldênio Burgos; Eduardo Alchieri

doi:10.5753/ctd.2022.222821

Aldênio Burgos UnB
Eduardo Alchieri UnB

DOI: https://doi.org/10.5753/ctd.2022.222821

Resumo

Replicação Máquina de Estados (RME) é uma abordagem muito utilizada na implementação de sistemas tolerantes a falhas e consiste em replicar os servidores e fazer com que os mesmos executem deterministicamente, e na mesma ordem, o mesmo conjunto de requisições. RMEs paralelas utilizam escalonadores que tiram proveito da semântica das requisições e permitem a execução paralela de algumas delas, melhorando assim o desempenho das aplicações. No escalonamento tardio as requisições são escalonadas para execução após serem ordenadas, enquanto que no escalonamento antecipado parte das decisões de escalonamento são realizadas antes da ordenação e respeitadas durante a execução. Este trabalho propõe um protocolo de escalonamento híbrido que tira proveito das vantagens de cada uma das abordagens anteriores. Experimentos mostram que o escalonador híbrido supera as outras abordagens em diversos cenários. Além disso, através da implementação de um sistema de pagamento, este trabalho mostra que a abordagem proposta minimiza limitações de desempenho anteriormente estudadas na implementação de blockchains que utilizam uma RME como bloco subjacente de construção.

Palavras-chave: Replicação Máquina de Estados Paralela, Escalonamento, Dependabilidade

Referências

Abraham, I., Malkhi, D., Nayak, K., Ren, L., and Spiegelman, A. (2017). Solida: A Blockchain Protocol Based on Reconfigurable Byzantine Consensus. In Proceedings of the 21st International Conference on Principles of Distributed Systems.

Alchieri, E., Dotti, F., Mendizabal, O. M., and Pedone, F. (2017). Reconfiguring parallel state machine replication. In SRDS.

Alchieri, E., Dotti, F., and Pedone, F. (2018). Early scheduling in parallel state machine replica. In ACM SoCC.

Bessani, A., Alchieri, E., Sousa, J., Oliveira, A., and Pedone, F. (2020). From byzantine replication to blockchain: Consensus is only the beginning. In 2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.

Bessani, A., Sousa, J. a., and Alchieri, E. E. P. (2014). State machine replication for the masses with bft-smart. In International Conf. on Dependable Systems and Networks.

Burgos, A., Alchieri, E., and Dotti, F. (2021). On the performance of using parallel state machine replication to implement blockchains. In 2021 10th Latin-American Symposium on Dependable Computing (LADC), pages 1–6.

Burgos, A., Alchieri, E., Dotti, F., and Pedone, F. (2022). Exploiting concurrency in sharded parallel state machine replication. IEEE Transactions on Parallel and Distributed Systems, 33(9):2133–2147.

Cachin, C. and Vukolic, M. (2017). Blockchain consensus protocol in the wild (invited paper). In Proceedings of the 31th International Symposium on Distributed Computing.

Castro, M. and Liskov, B. (1999). Practical Byzantine fault tolerance. In 3rd Symposium on Operating Systems Design and Implementation. USENIX Association.

Castro, M. and Liskov, B. (2002). Practical byzantine fault-tolerance and proactive recovery. ACM Transactions on Computer Systems, 20(4):398–461.

Cui, H., Gu, R., Liu, C., Chen, T., and Yang, J. (2015). Paxos made transparent. In 25th Symposium on Operating Systems Principles.

Dwork, C., Lynch, N. A., and Stockmeyer, L. (1988). Consensus in the presence of partial synchrony. Journal of ACM, 35(2):288–322.

Escobar, I. A., Alchieri, E., Dotti, F. L., and Pedone, F. (2019). Boosting concurrency in parallel state machine replication. In Proc. 20th International Middleware Conference.

Guo, Z., Hong, C., Yang, M., Zhou, D., Zhou, L., and Zhuang, L. (2014). Rex: Replication at the speed of multi-core. In European Conference on Computer Systems.

Habiger, G., Hauck, F. J., Reiser, H. P., and Köstler, J. (2020). Self-optimising applicationagnostic multithreading for replicated state machines. In SRDS.

Herlihy, M. and Wing, J. M. (1990). Linearizability: A correctness condition for concurrent objects. ACM Trans. on Programing Languages and Systems, 12(3):463–492.

Kapritsos, M., Wang, Y., Quema, V., Clement, A., Alvisi, L., and Dahlin, M. (2012). All about eve: execute-verify replication for multi-core servers. In OSDI.

Kotla, R. and Dahlin, M. (2004a). High throughput byzantine fault tolerance. In IEEE/IFIP Int. Conference on Dependable Systems and Networks.

Kotla, R. and Dahlin, M. (2004b). High throughput byzantine fault tolerance. In DSN.

Lamport, L. (1998). The part-time parliament. ACM Transactions on Computer Systems, 16(2):133–169.

Marandi, P. J., Bezerra, C. E., and Pedone, F. (2014a). Rethinking state machine replication for parallelism. In ICDCS.

Marandi, P. J., Bezerra, C. E. B., and Pedone, F. (2014b). Rethinking state-machine replication for parallelism. In ICDCS.

Marandi, P. J. and Pedone, F. (2014). Optimistic parallel state-machine replication. In SRDS.

Mendizabal, O. M., Moura, R. T. S., Dotti, F. L., and Pedone, F. (2017). Efficient and deterministic scheduling for parallel state machine replication. In IPDPS.

Pass, R. and Shi, E. (2017). Hybrid Consensus: Efficient Consensus in the Permissionless Model. In Proceedings of the 31st International Symposium on Distributed Computing.

Schneider, F. B. (1990). Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Computing Surveys, 22(4):299–319.