Validation of Image Content Using LLMs and RAGs: A Strategy for Ensuring Compliance in Software Testing

  • Wendell Marques, Sidia Institute of Science and Technology
  • Caina Oliveira, Sidia Institute of Science and Technology
  • Carol Fernandes, Sidia Institute of Science and Technology
  • Luiz Ribeiro, Sidia Institute of Science and Technology
  • Edluce Veras, Sidia Institute of Science and Technology
  • Gabriel Sampaio, Sidia Institute of Science and Technology
  • Kaua Sanches, Sidia Institute of Science and Technology
  • Renan Peres, Sidia Institute of Science and Technology

Abstract


This article presents an approach for validating image content using large language models (LLMs) and retrieval-augmented generation (RAG). The proposal aims to improve software testing by ensuring that the visual data used in tests is accurate and compliant. By integrating these technologies, the study demonstrates how content verification in images can be automated, identifying inconsistencies and checking that the data meet established standards. The article also discusses the challenges and limitations of the approach, as well as its practical applications in software testing scenarios. The results indicate that combining LLMs with RAG offers an efficient and scalable solution for visual content validation, contributing significantly to the quality and reliability of testing processes.
Keywords: RAG, LLM, Software Engineering, Testing
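
The abstract does not detail the pipeline, so the following is only a minimal sketch of how a retrieval-augmented check of this kind might be wired together, not the authors' implementation. It assumes the image has already been turned into a text description (for example by a captioning or OCR step), that the compliance standards are stored as short text rules, and that the model is any callable that maps a prompt to a string. The rule texts, the bag-of-words retriever, and the names `RULES`, `retrieve`, `build_prompt`, and `validate` are all hypothetical.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical compliance rules; the paper does not list its actual standards.
RULES = [
    "The status bar must show the battery icon in the top right corner.",
    "Error dialogs must include a red warning icon and a Close button.",
    "The login screen must display the company logo above the username field.",
]


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, rules: List[str], k: int = 2) -> List[str]:
    """Retrieval step: return the k rules most similar to the query."""
    q = tokenize(query)
    ranked = sorted(rules, key=lambda r: cosine(q, tokenize(r)), reverse=True)
    return ranked[:k]


def build_prompt(image_description: str, rules: List[str]) -> str:
    """Augmentation step: place the retrieved rules in the model's context."""
    rule_lines = "\n".join(f"- {r}" for r in rules)
    return (
        "You are checking a screenshot against compliance rules.\n"
        f"Rules:\n{rule_lines}\n"
        f"Screenshot description: {image_description}\n"
        "Answer with COMPLIANT or NON-COMPLIANT, followed by a short reason."
    )


def validate(
    image_description: str,
    rules: List[str],
    llm: Callable[[str], str],
    k: int = 2,
) -> Tuple[bool, str]:
    """Generation step: ask the model and parse its verdict."""
    prompt = build_prompt(image_description, retrieve(image_description, rules, k))
    answer = llm(prompt).strip()
    return answer.upper().startswith("COMPLIANT"), answer


if __name__ == "__main__":
    # Stub model used only so the sketch runs without any external service.
    def stub_llm(prompt: str) -> str:
        return "NON-COMPLIANT: the company logo is not visible."

    print(validate("login screen with username field and no logo", RULES, stub_llm))
```

In practice the stub would be replaced by a call to a real (possibly multimodal) model, and the bag-of-words retriever by an embedding index. The split into retrieval, prompt construction, and verdict parsing stays the same.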

Published
2025-09-22
MARQUES, Wendell; OLIVEIRA, Caina; FERNANDES, Carol; RIBEIRO, Luiz; VERAS, Edluce; SAMPAIO, Gabriel; SANCHES, Kaua; PERES, Renan. Validation of Image Content Using LLMs and RAGs: A Strategy for Ensuring Compliance in Software Testing. In: BRAZILIAN SYMPOSIUM ON SYSTEMATIC AND AUTOMATED SOFTWARE TESTING (SAST), 10., 2025, Recife/PE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 153-155. DOI: https://doi.org/10.5753/sast.2025.13970.