What do we know about testing in chatbots? A systematic literature review
Abstract
The increasing use of conversational agents (chatbots) raises complex design, implementation, and, especially, testing issues. We conducted a systematic literature review, complemented by snowballing, to characterize which tools and methods support testing activities in this application domain. We identified several tools that can support chatbot testing, and we found that the field still needs to reach a consensus. The main contribution of this work is a characterization of state-of-the-art testing tools and methods that support the construction and validation of chatbots.
References
Bird, J. J., Ekárt, A., and Faria, D. R. (2023). Chatbot interaction with artificial intelligence: human data augmentation with T5 and language transformer ensemble for text classification. Journal of Ambient Intelligence and Humanized Computing, 14(4):3129–3144.
Bozic, J., Tazl, O. A., and Wotawa, F. (2019). Chatbot testing using AI planning. In 2019 IEEE International Conference On Artificial Intelligence Testing (AITest), pages 37–44. IEEE.
Bozic, J. and Wotawa, F. (2019). Testing chatbots using metamorphic relations. In Testing Software and Systems: 31st IFIP WG 6.1 International Conference, ICTSS 2019, Paris, France, October 15–17, 2019, Proceedings 31, pages 41–55. Springer.
Božić, J. (2022). Ontology-based metamorphic testing for chatbots. Software Quality Journal, 30:227–251.
Bravo-Santos, S., Guerra, E., and de Lara, J. (2020). Testing chatbots with Charm. In Quality of Information and Communications Technology: 13th International Conference, QUATIC 2020, Faro, Portugal, September 9–11, 2020, Proceedings 13, pages 426–438. Springer.
Cabot, J., Burgueno, L., Clarisó, R., Daniel, G., Perianez-Pascual, J., and Rodriguez-Echeverria, R. (2021). Testing challenges for NLP-intensive bots. In 2021 IEEE/ACM Third International Workshop on Bots in Software Engineering (BotSE), pages 31–34. IEEE.
Guerreiro, A. and Barros, D. M. V. (2019). Novos desafios da educação a distância: programação e uso de chatbots.
Guglielmi, E., Rosa, G., Scalabrino, S., Bavota, G., and Oliveto, R. (2022). Sorry, I don’t understand: Improving voice user interface testing. Association for Computing Machinery.
Kitchenham, B., Pretorius, R., Budgen, D., Brereton, O. P., Turner, M., Niazi, M., and Linkman, S. (2010). Systematic literature reviews in software engineering–a tertiary study. Information and software technology, 52(8):792–805.
Moraes, S. M. and de Souza, L. S. (2015). Uma abordagem semiautomática para expansão e enriquecimento linguístico de bases AIML para chatbots. In Congresso Internacional de Informática Educativa, volume 20, pages 600–605.
Nunes, F. O. (2012). Chatbots e mimetismo: uma conversa entre humanos, robôs e artistas. In Proceedings of 6th International Conference on Digital Arts—ARTECH, pages 89–96.
Padmanabhan, M. (2019). Sustainable test path generation for chatbots using customized response. International Journal of Engineering and Advanced Technology, 8:149–155.
Petersen, K., Vakkalanka, S., and Kuzniarz, L. (2015). Guidelines for conducting systematic mapping studies in software engineering: An update. Information and software technology, 64:1–18.
Ruane, E., Faure, T., Smith, R., Bean, D., Carson-Berndsen, J., and Ventresque, A. (2018). Botest: a framework to test the quality of conversational agents using divergent input examples. In Proceedings of the 23rd International Conference on Intelligent User Interfaces Companion, pages 1–2.
Santos, M. B. D., Furtado, A. P. C., Nogueira, S. C., and Moreira, D. D. (2020). Oggybug: A test automation tool in chatbots. pages 79–87. Association for Computing Machinery.
Selvi, V., Saranya, S., Chidida, K., and Abarna, R. (2019). Chatbot and bullyfree chat. In 2019 IEEE International Conference on System, Computation, Automation and Networking (ICSCAN), pages 1–5. IEEE.
Shawar, B. A. and Atwell, E. (2007). Chatbots: are they really useful? Journal for Language Technology and Computational Linguistics, 22(1):29–49.
Souza, P. H. C. (2022). Proposta de implementação de chatbot para o observatório do instituto do mar.
Valle, P. H. D., Vilela, R. F., and Hernandes, E. C. M. (2020). Does gamification improve the training of software testers? a preliminary study from the industry perspective. In Proceedings of the XIX Brazilian Symposium on Software Quality, pages 1–10.
Vasconcelos, M., Candello, H., Pinhanez, C., and dos Santos, T. (2017). Bottester: testing conversational systems with simulated users. In Proceedings of the XVI Brazilian Symposium on Human Factors in Computing Systems, pages 1–4.
Velásquez, F. R. (2023). O ChatGPT na pesquisa em humanidades digitais: Oportunidades, críticas e desafios. TEKOA, 2(2).
Vijayaraghavan, V., Cooper, J. B., and Leevinson, R. L. R. (2020). Algorithm inspection for chatbot performance evaluation. volume 171, pages 2267–2274. Elsevier B.V.
Published
2024-07-21
How to Cite
SANTOS, Gabriel; SILVA, Williamson; VALLE, Pedro Henrique Dias. What do we know about testing in chatbots? A systematic literature review. In: PROCEEDINGS OF WORKSHOP ON SOCIAL, HUMAN AND ECONOMIC ASPECTS OF SOFTWARE (WASHES), 9., 2024, Brasília/DF. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 106-117. ISSN 2763-874X. DOI: https://doi.org/10.5753/washes.2024.2897.
