A High-Availability Architecture for Virtualized Network Services
Abstract
Network Functions Virtualization (NFV) allows services that run in the network core to be implemented in software. Complex services can be built by composing multiple Virtual Network Functions (VNFs) into Service Function Chains (SFCs). Because these services perform critical tasks, guaranteeing their reliability is essential. This work proposes NHAM (NFV High Availability Module), a high-availability architecture for virtualized network services. NHAM is designed as a module of the NFV-MANO architecture and guarantees the continuous availability of stateful SFCs. It performs fault management and offers a choice of recovery mechanisms, which are applied according to the requirements of each service. The high-availability strategy combines SFC buffer management with Checkpoint/Restore. A prototype was implemented, and experimental results are presented.
