An Intelligent Agent for Automated Test Generation from OpenAPI Specifications

Ricardo F. Vilela; Stevão Alves de Andrade; Eduardo B. F. Santos; Williamson Silva; Pedro H. D. Valle

doi:10.5753/wbots.2025.15217

Ricardo F. Vilela, Unicamp
Stevão Alves de Andrade, UFAL
Eduardo B. F. Santos, UFAL
Williamson Silva, UFCA
Pedro H. D. Valle, USP

Abstract

Context: The widespread adoption of RESTful APIs demands effective contract validation, especially in critical domains. Motivation: Tools such as Postman and Newman automate test execution, but test creation remains manual, error-prone, and difficult to maintain. Objective: This work proposes an intelligent agent based on large language models (LLMs) that interprets OpenAPI documents and automatically generates test collections. Method: The solution fragments OpenAPI documents by endpoint, generates a specialized prompt for each fragment, and uses an LLM accessed via OpenRouter to create automated tests, which are validated with Newman against a public (anonymized) API. Results: Running the generated tests produced 34 requests: 12 succeeded, 14 failed critically because of malformed URIs, and 8 contained structural or semantic errors. The results demonstrate that the approach is feasible and reduces manual effort, but also indicate that further validation and refinement are needed before CI/CD use.
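The first two steps of the method (fragmenting the OpenAPI document by endpoint and building one prompt per fragment) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fragment_by_endpoint` and `build_prompt`, the prompt wording, the sample specification, and the base URL `https://api.example.test` are all assumptions made for illustration. The call to the LLM through OpenRouter and the Newman run are deliberately left out.

```python
import json

# HTTP methods that may appear as operation keys in an OpenAPI Path Item Object.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def fragment_by_endpoint(spec):
    """Split an OpenAPI document (already parsed into a dict) into one
    fragment per (path, method) pair. Hypothetical helper, for illustration."""
    fragments = []
    for path, item in spec.get("paths", {}).items():
        # Parameters declared at path level apply to every operation under it.
        shared_params = item.get("parameters", [])
        for key, operation in item.items():
            if key.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            fragments.append({
                "path": path,
                "method": key.upper(),
                "parameters": shared_params + operation.get("parameters", []),
                "operation": operation,
            })
    return fragments


def build_prompt(fragment, base_url):
    """Build a prompt for a single endpoint. The wording is an assumption,
    not the prompt used in the paper."""
    return (
        "Generate a Postman collection item with test scripts for this endpoint.\n"
        f"Base URL: {base_url}\n"
        f"Endpoint: {fragment['method']} {fragment['path']}\n"
        "Parameters (JSON):\n"
        f"{json.dumps(fragment['parameters'], indent=2)}\n"
        "Operation definition (OpenAPI, JSON):\n"
        f"{json.dumps(fragment['operation'], indent=2)}"
    )


# Tiny made-up specification used only to exercise the helpers.
spec = {
    "openapi": "3.1.0",
    "paths": {
        "/items": {
            "get": {"responses": {"200": {"description": "OK"}}},
            "post": {"responses": {"201": {"description": "Created"}}},
        },
        "/items/{id}": {
            "parameters": [{"name": "id", "in": "path", "required": True}],
            "get": {"responses": {"200": {"description": "OK"}}},
        },
    },
}

fragments = fragment_by_endpoint(spec)
prompts = [build_prompt(f, "https://api.example.test") for f in fragments]
```

Passing the base URL and the full path explicitly in each prompt is one plausible way to reduce malformed URIs of the kind reported in the results, although the abstract does not say what caused them.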
