AutomTest 3.0: An automated test-case generation tool from User Story processing powered with LLMs

Joanderson Gonçalves Santos; Rita Suzana Pitangueira Maciel

doi:10.5753/sbes.2024.3603

Joanderson Gonçalves Santos UFBA
Rita Suzana Pitangueira Maciel UFBA

Abstract

Test-driven development (TDD) and behavior-driven development (BDD) address software development issues by proposing that unit tests be written for each system component before the code for the corresponding functionality is implemented. This approach results in an objective and compact solution, ensures that the entire codebase is tested, and enhances the overall quality of the software. However, the testing process can be time-consuming and resource-intensive. Some tools have been proposed to address these issues, but they lack functionality for working from user stories in a TDD approach. In this context, we propose the AutomTest 3.0 tool, which facilitates unit test case generation from user stories before the code for the corresponding functionality is written. Built upon previous versions of AutomTest, the tool extends them with generative AI that suggests methods the software under test should have. It creates test cases for Java using techniques such as equivalence class partitioning, natural language processing, and generative artificial intelligence. To evaluate the effectiveness of AutomTest 3.0, we conducted an exploratory study with software development professionals. Their feedback highlighted the tool's utility in their daily work routines. AutomTest 3.0 demonstrated promising results in test case generation, scenario coverage, and speed of test case creation. Video: https://youtu.be/xoTrvhlfvu8

Keywords: Test Case Generation, Test-Driven Development (TDD), Natural Language Processing (NLP), Generative AI, User Story
