Closing the Gap Between Lookahead and Checkpointing to Provide Hybrid Synchronization
Abstract
Hybrid synchronization enables simulations that capture more in-depth details of real distributed systems. However, recent advances in algorithms for synchronizing logical processes (LPs) are difficult to integrate into existing simulation architectures. This paper explores an alternative architecture for hybrid synchronization. We present optimistic and conservative synchronization primitives and design mechanisms that enable LPs to cooperate during the execution of a simulation. The results show that our primitives reduce rollback time and idleness in the simulation.