Empirical evaluation of automated software test generation from the perspective of test smells
Abstract
Software testing is a crucial activity for developing high-quality software, but it is usually more costly than developing the production code itself. To reduce the cost of software projects, automated test generation tools such as Randoop and Evosuite have been strongly encouraged. However, evidence is still needed on how the use of these tools affects test quality. In this context, this study presents an empirical evaluation of the quality of automatically generated tests from the perspective of test smells. The study included the development of JNose Test, an open-source tool for the automated collection and analysis of test smells. During the tool's development, a first study was carried out with 11 open-source projects to examine the relationship between test smells and test coverage; a second study then assessed the quality of automatically generated tests in 21 open-source projects. In the first study, we found strong relationships between coverage metrics and test smells. In the second, the results indicate a high diffusion of test smells in the test code generated by the Evosuite and Randoop tools, as well as frequent co-occurrences between different types of test smells in the evaluated projects.
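To make the notion of a test smell concrete, consider Assertion Roulette, one of the smells most frequently reported in both hand-written and generated test code: a test method that contains several assertions without explanatory messages, so a failure report does not reveal which check broke. A minimal sketch using Python's `unittest` (the `Stack` class and its methods are hypothetical, introduced only for illustration):

```python
import unittest


class Stack:
    """Hypothetical class under test."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def size(self):
        return len(self._items)


class StackTest(unittest.TestCase):
    def test_stack(self):
        # Assertion Roulette smell: four assertions in one test method,
        # none carrying a message, so a failure gives no hint about
        # which behavior of Stack actually broke.
        s = Stack()
        s.push(1)
        s.push(2)
        self.assertEqual(s.size(), 2)
        self.assertEqual(s.pop(), 2)
        self.assertEqual(s.pop(), 1)
        self.assertEqual(s.size(), 0)
```

A smell-free version would split the method into focused tests, or attach a `msg` argument to each assertion; detection tools such as the one developed in this study flag patterns like the above automatically.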
References
Bavota, G., Qusef, A., Oliveto, R., Lucia, A. D., and Binkley, D. W. (2015). Are test smells really harmful? An empirical study. Empir. Softw. Eng., 20(4):1052–1094.
CISQ (2021). The Cost of Poor Software Quality in the US: A 2020 Report. https://www.it-cisq.org/pdf/CPSQ-2020-report.pdf. Accessed: March 1st, 2021.
Fraser, G. and Arcuri, A. (2014). A large-scale evaluation of automated unit test generation using evosuite. ACM Trans. on Software Engineering and Methodology, 24:1–42.
Garousi, V. and Kucuk, B. (2018). Smells in software test code: A survey of knowledge in industry and academia. Journal of Systems and Software, 138:52–81.
Gopinath, R., Jensen, C., and Groce, A. (2014). Code coverage for suite evaluation by developers. In Proceedings of the 36th International Conference on Software Engineering, ICSE.
Grano, G., Palomba, F., Nucci, D. D., Lucia, A. D., and Gall, H. C. (2019). Scented since the beginning: On the diffuseness of test smells in automatically generated test code. Journal of Systems and Software, 156:312–327.
Greiler, M., van Deursen, A., and Storey, M. D. (2013). Automated detection of test fixture strategies and smells. In 6th International Conference on Software Testing, Verification and Validation (ICST). IEEE Computer Society.
Meszaros, G., Smith, S., and Andrea, J. (2003). The test automation manifesto. In Maurer, F. and Wells, D., editors, Third XP and Second Agile Universe Conference, volume 2753 of LNCS. Springer.
Pacheco, C. and Ernst, M. (2007). Randoop: Feedback-directed random testing for Java. In Proceedings of the Conference on Object-Oriented Programming Systems, Languages, and Applications, OOPSLA, pages 815–816.
Palomba, F., Nucci, D. D., Panichella, A., Oliveto, R., and Lucia, A. D. (2016). On the diffusion of test smells in automatically generated test code: An empirical study. In 9th International Workshop on Search-Based Software Testing (SBST). ACM.
Peruma, A., Almalki, K., Newman, C. D., Mkaouer, M. W., Ouni, A., and Palomba, F. (2019). On the distribution of test smells in open source android applications: An exploratory study. In 29th Annual International Conference on Computer Science and Software Engineering (CASCON). ACM.
Silva, I. P. S. C., Alves, E. L. G., and Andrade, W. L. (2017). Analyzing automatic test generation tools for refactoring validation. pages 38–44.
Somé, S. S. and Cheng, X. (2008). An approach for supporting system-level test scenarios generation from textual use cases. In ACM Symposium on Applied Computing. ACM.
Spadini, D., Schvarcbacher, M., Oprescu, A.-M., Bruntink, M., and Bacchelli, A. (2020). Investigating severity thresholds for test smells. In 17th International Conference on Mining Software Repositories (MSR). ACM.
van Deursen, A., Moonen, L., van den Bergh, A., and Kok, G. (2001). Refactoring test code. In 2nd International Conference on Extreme Programming and Flexible Processes in Software Engineering (XP).
Virgínio, T., Martins, L., Rocha, L., Santana, R., Costa, H., and Machado, I. (2020a). An empirical study of automatically-generated tests from the perspective of test smells. In 34th Brazilian Symposium on Software Engineering (SBES). ACM.
Virgínio, T., Martins, L., Rocha, L., Santana, R., Cruz, A., Costa, H., and Machado, I. (2020b). JNose: Java Test Smell Detector. In 34th Brazilian Symposium on Software Engineering (SBES). ACM.
Virgínio, T., Martins, L., Rocha, L., Santana, R., Cruz, A., Costa, H., and Machado, I. (2021). On the test smells detection: An empirical study on the JNose Test accuracy. Journal of Software Engineering Research and Development (JSERD). Under review.
Virgínio, T., Railana, S., Martins, L. A., Soares, L. R., Costa, H., and Machado, I. (2019). On the influence of test smells on test coverage. In 33rd Brazilian Symposium on Software Engineering (SBES). ACM.
Wood, C. L. and Altavela, M. M. (1978). Large-sample results for Kolmogorov-Smirnov statistics for discrete distributions. Biometrika, 65(1):235–239.
