The Influence of Using LLMs on the Activities of a Software Testing Team: An Industry Case

Flávia Oliveira; Leonardo Tiago; Lennon Chaves

doi:10.5753/sast.2025.13592

Flávia Oliveira, Sidia Institute of Science and Technology
Leonardo Tiago, Sidia Institute of Science and Technology
Lennon Chaves, Sidia Institute of Science and Technology

Abstract

Context: In this study, we investigated the integration of Large Language Models (LLMs) into test teams, aiming to understand how these tools can improve the test process and raise productivity. The main goal was to determine whether adopting LLMs can enhance the productivity of test professionals. Methodology: We performed an empirical study with 14 participants and analysed the results quantitatively and qualitatively. For the quantitative analysis, we analysed the frequency of activities; for the qualitative analysis, we analysed the participants’ perception of productivity through a scoring system in which participants gave a score from 0 to 5 for their perception of learning before and after the adoption of LLMs in the team. Results: The results show a statistically significant improvement in productivity (p-value = 0.0193) after the adoption of LLMs. Furthermore, the qualitative analysis provided data on the impact of LLMs on learning, usage experience, perceived trust, and productivity. Conclusion: LLMs integrated into test teams can be a valuable tool for improving the productivity and efficiency of test professionals.

Keywords: Large Language Models, Software Industry, Software Testing
