A High-Availability Architecture for Virtualized Network Functions
Abstract
Virtualization has been transforming how networks are built and managed. In particular, network functions implemented in dedicated hardware can be replaced by Virtual Network Functions (VNFs), which can even be obtained from marketplaces on the Internet. However, VNFs are more susceptible to failures. This work proposes a high-availability architecture for VNFs that encompasses failure management and several recovery strategies. Because VNFs run in virtualized environments, the entire state of a stateful VNF can be copied, an attractive strategy that requires no changes to the VNF's internal code. The strategy is based on Checkpoint/Restore, and the architecture was designed in accordance with the NFV-MANO reference architecture. A prototype was implemented as a proof of concept, and experimental results are presented.