Simplifying log forensics analysis using Large Language Models with the RAG Technique
Abstract
In the digital age, the growing complexity of computer systems and the sophistication of cyberattacks have significantly increased the volume of logs generated, posing challenges for cybersecurity professionals. Detecting and interpreting attacks or issues in these records is crucial for a swift response to security incidents. In this context, Large Language Models (LLMs) have emerged as fundamental tools for understanding and generating natural language. This study presents an approach to analyzing system and network logs that aims to detect, correlate, and interpret anomalies using the Retrieval-Augmented Generation (RAG) technique with LLMs and interaction through targeted questions. The results demonstrate the effectiveness of the proposed approach in generating relevant insights and simplifying forensic analysis for professionals in the field.
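To make the pipeline concrete, the sketch below shows one minimal way such a log RAG loop can be wired: log lines are grouped into chunks and indexed, the chunks most similar to an analyst's targeted question are retrieved, and the question plus the retrieved context is sent to an LLM. This is an illustrative sketch under stated assumptions, not the authors' implementation: the TF-IDF retriever (scikit-learn), the fixed-size chunking, and the ask_llm stub are all assumptions introduced here.

# Minimal RAG-over-logs sketch (illustrative; not the paper's implementation).
# Assumptions: scikit-learn TF-IDF for retrieval; `ask_llm` is a hypothetical
# stand-in for whatever chat-completion endpoint is available.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def chunk_logs(lines, size=20):
    # Group raw log lines into fixed-size chunks so that retrieval returns
    # enough surrounding context to interpret an event.
    return ["\n".join(lines[i:i + size]) for i in range(0, len(lines), size)]

def retrieve(question, chunks, k=3):
    # Rank chunks by cosine similarity between the question and each chunk.
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(chunks)
    scores = cosine_similarity(vectorizer.transform([question]), matrix).ravel()
    return [chunks[i] for i in scores.argsort()[::-1][:k]]

def ask_llm(prompt):
    # Hypothetical stub: replace with a call to a locally or remotely
    # hosted LLM (e.g., an OpenAI-compatible chat-completion endpoint).
    raise NotImplementedError

def answer(question, log_lines):
    # Assemble the targeted question and retrieved log excerpts into a prompt.
    context = "\n---\n".join(retrieve(question, chunk_logs(log_lines)))
    prompt = (
        "You are assisting a forensic log analysis.\n"
        f"Relevant log excerpts:\n{context}\n\n"
        f"Question: {question}\n"
        "Point out anomalies and correlate related events."
    )
    return ask_llm(prompt)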
Published
2025-09-01
How to Cite
BARROS, Carlos G. L.; LIMA, João P. A.; ARRUDA, Alexandre; SOUSA, Rubens Abraão da Silva; BANDEIRA, Alan Portela. Simplifying log forensics analysis using Large Language Models with the RAG Technique. In: BRAZILIAN SYMPOSIUM ON CYBERSECURITY (SBSEG), 25., 2025, Foz do Iguaçu/PR. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 839-854. DOI: https://doi.org/10.5753/sbseg.2025.10682.
