Structured Extraction of Vulnerabilities in OpenVAS and Tenable WAS Reports Using LLMs
Abstract
This paper proposes an automated, LLM-based method for extracting and structuring vulnerabilities from OpenVAS and Tenable WAS scanner reports, converting unstructured data into a standardized format for risk management. In an evaluation on a report containing 34 vulnerabilities, GPT-4.1 and DeepSeek achieved the highest similarity to the baseline (ROUGE-L above 0.7). The results indicate that the method is feasible for transforming complex reports into usable datasets, enabling effective prioritization and future anonymization of sensitive data.
References
Chen, B. (2025). Unleashing the potential of prompt engineering for large language models. Artificial Intelligence Review.
Fabacher, M., Meyer, S., and Lang, A. (2025). Efficient extraction of medication information from clinical notes: An evaluation in two languages. arXiv preprint arXiv:2502.03257.
Greenbone (2025). Openvas report - user manual. [link].
Hu, Y., Li, J., and Wang, H. (2025). Large language model driven transferable key information extraction. Scientific Reports. Online; accessed 10 Nov. 2025.
Li, X., Zhou, M., and Tang, Y. (2024). Enhancing visual information extraction with large language models. In Intelligent Data Engineering and Automated Learning. Springer.
Rede Nacional de Ensino e Pesquisa (2018). Relatório anual de segurança de 2017. Technical report, Rede Nacional de Ensino e Pesquisa, Brasil.
Rede Nacional de Ensino e Pesquisa (2024). Relatório anual de segurança de 2023. Technical report, Rede Nacional de Ensino e Pesquisa, Brasil.
Tenable (2025). Tenable web app scanning user guide. [link].
Yan, T., Zhang, P., and Xu, L. (2025). Docextractnet: A novel framework for enhanced document information extraction. Information Processing & Management.
Zhong, Z., Li, Y., and Zhang, J. (2024). Enhancing multimodal large language models with multi-instance visual prompt generator for visual representation enrichment. Amazon Science.
Published
December 8, 2025
How to Cite
MACHADO, Beatriz; LAUTERT, Douglas; KAPELINSKI, Cristhian; KREUTZ, Diego. Structured Extraction of Vulnerabilities in OpenVAS and Tenable WAS Reports Using LLMs. In: ESCOLA REGIONAL DE REDES DE COMPUTADORES (ERRC), 22., 2025, Porto Alegre/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 144-150. DOI: https://doi.org/10.5753/errc.2025.17776.