Framework for the Verification and Experimental Validation of LLM-Generated Network Configurations

Cristiano da Silveira Colombo; Magnos Martinello

doi:10.5753/sbrc.2026.19857

Cristiano da Silveira Colombo (UFES / IFES)
Magnos Martinello (UFES)

Abstract

The use of Large Language Models (LLMs) to generate network configurations and validate network policies has received growing attention in the literature. Unlike the prevailing approach, which prioritizes generation accuracy, this work focuses on the verification process, evaluating stability under load (batch size) and the complexity of intents translated from natural language into technical requirements. The paper makes three contributions: a multi-stage architecture that integrates the detection of logical inconsistencies into the LLM output; the identification of the models' reliability limit with respect to task batch size; and a demonstration that syntactic validity is an insufficient indicator of operational connectivity in computer networks. The results show that the volume of simultaneous processing is a determining factor in the degradation of correctness, suggesting a critical inflection point at around 20 tasks per batch. We also observed that the structural precision of the models is inversely proportional to the density of the context window. This phenomenon indicates that the LLM suffers a dilution of focus and tends to omit essential logical parameters, which invalidates the configuration in the data plane.
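The gap between syntactic validity and operational connectivity can be illustrated with a minimal Python sketch. It is not the authors' framework: the configuration format (a per-router map from destination prefix to next hop), the function names, and the two-router topology are all hypothetical, and the lookup is an exact prefix match rather than longest-prefix match. The sketch shows a configuration in which every entry is well formed, yet one omitted route breaks reachability.

```python
import ipaddress

# Hypothetical, simplified configuration format (an assumption, not the paper's):
# each router maps a destination prefix to the name of the next-hop router,
# or to "local" when the prefix is directly attached.
Config = dict[str, dict[str, str]]


def is_syntactically_valid(config: Config) -> bool:
    """Check form only: every prefix parses and every next hop is a known name."""
    for routes in config.values():
        for prefix, next_hop in routes.items():
            try:
                ipaddress.ip_network(prefix)
            except ValueError:
                return False
            if next_hop != "local" and next_hop not in config:
                return False
    return True


def can_reach(config: Config, source: str, prefix: str) -> bool:
    """Follow next hops from source until the prefix is delivered or the path breaks."""
    current, visited = source, set()
    while current not in visited:
        visited.add(current)
        next_hop = config[current].get(prefix)
        if next_hop is None:
            return False  # missing route: traffic is dropped at this router
        if next_hop == "local":
            return True
        current = next_hop
    return False  # a router was revisited: forwarding loop


def is_operationally_connected(config: Config, flows: list[tuple[str, str]]) -> bool:
    """Require every (source router, destination prefix) flow to be deliverable."""
    return all(can_reach(config, src, prefix) for src, prefix in flows)


config = {
    "r1": {"10.0.1.0/24": "local", "10.0.2.0/24": "r2"},
    "r2": {"10.0.2.0/24": "local"},  # return route to 10.0.1.0/24 omitted
}
flows = [("r1", "10.0.2.0/24"), ("r2", "10.0.1.0/24")]

syntax_ok = is_syntactically_valid(config)
connected = is_operationally_connected(config, flows)
print(syntax_ok, connected)  # True False
```

Here the syntactic check passes because each entry is individually well formed, while the reachability check fails because `r2` has no route back to `10.0.1.0/24`. Only a check that exercises forwarding behavior exposes this kind of omission.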
