A Method for Regression Testing Plan Ordering for Non-Automated Executions in Black Box Testing

  • Vinicius Hernandes UFAM
  • André Carvalho UFAM
  • Eulanda Santos UFAM
  • Yan Soares UFAM
  • Hygo Oliveira UFAM
  • Adamor Barros UFAM
  • Ronaldo Soares UFAM
  • Alexandre Lima UFAM
  • Raoni Ferreira UFAC
  • Gabriel Martins UFAC
  • Lucas Carvalho UFAC
  • Nicolas Assumpção Motorola Mobility LLC
  • José Nascimento Motorola Mobility LLC
  • Eliane Collins INDT
  • Silvia Ascate INDT
  • Mateus Souza INDT

Abstract


In this paper, we propose a method for prioritizing regression test cases according to their probability of detecting software execution failures, without requiring source code analysis. The method uses the Sentence-BERT model to extract embeddings from the textual content of development commits and test scripts. Machine learning models then use these embeddings to predict the probability of detecting a failure. In our experiments, the proposed method achieves results equal to or better than those of human experts in 92.52% to 94.24% of scenarios as measured by the APFD (Average Percentage of Faults Detected) metric, with an overall gain of 10% in mean APFD and a potential gain of up to 6.03% in test plan prioritization counting cases.
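
To make the evaluation metric concrete, the sketch below computes APFD using its standard definition, APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of test cases, m the number of faults, and TF_i the position of the first test that detects fault i. It also orders a test plan by descending predicted failure probability. The function names, test identifiers, fault mapping, and probability values are all hypothetical and chosen only for illustration; this is not the authors' implementation, and the probabilities stand in for the output of whatever model is used.

```python
from typing import Dict, List, Sequence, Set


def order_by_probability(failure_prob: Dict[str, float]) -> List[str]:
    """Return test ids sorted by descending predicted failure probability.

    Ties are broken by test id so the ordering is deterministic.
    """
    return sorted(failure_prob, key=lambda t: (-failure_prob[t], t))


def apfd(order: Sequence[str], faults: Dict[str, Set[str]]) -> float:
    """Compute APFD for a test execution order.

    order  -- test ids in the order they are executed
    faults -- maps each fault id to the set of test ids that detect it
    """
    n, m = len(order), len(faults)
    if n == 0 or m == 0:
        raise ValueError("APFD needs at least one test and one fault")

    # 1-based position of every test in the execution order.
    position = {test: i + 1 for i, test in enumerate(order)}

    tf_sum = 0
    for fault, detecting_tests in faults.items():
        hits = [position[t] for t in detecting_tests if t in position]
        if not hits:
            raise ValueError(f"fault {fault!r} is not detected by any test")
        tf_sum += min(hits)  # position of the first test revealing this fault

    return 1 - tf_sum / (n * m) + 1 / (2 * n)


# Hypothetical data: five tests, two faults, made-up model probabilities.
faults = {"F1": {"t3"}, "F2": {"t5"}}
probs = {"t1": 0.1, "t2": 0.2, "t3": 0.9, "t4": 0.3, "t5": 0.8}

baseline = ["t1", "t2", "t3", "t4", "t5"]
prioritized = order_by_probability(probs)

print(round(apfd(baseline, faults), 2))     # 0.3
print(round(apfd(prioritized, faults), 2))  # 0.8
```

In this toy example the fault-revealing tests move from positions 3 and 5 to positions 1 and 2, which raises APFD from 0.3 to 0.8; higher values mean faults are found earlier in the plan.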

Keywords: Regression Test, Test Plan, APFD, Functionality Test, Black-Box Testing

Published
2025-05-12
HERNANDES, Vinicius et al. A Method for Regression Testing Plan Ordering for Non-Automated Executions in Black Box Testing. In: IBERO-AMERICAN CONFERENCE ON SOFTWARE ENGINEERING (CIBSE), 28., 2025, Ciudad Real/Spain. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 120-134. DOI: https://doi.org/10.5753/cibse.2025.35296.