Empirical evaluation of automated software test generation from the perspective of test smells
Abstract
Software testing is a crucial activity for developing high-quality software, but it is usually more costly than developing the production code itself. To reduce the cost of software projects, automated test generation tools such as Randoop and Evosuite have been strongly encouraged. However, evidence is still needed on how the use of these tools affects test quality. In this context, this study presents an empirical evaluation of the quality of automatically generated tests from the perspective of test smells. The study included the development of JNose Test, an open-source tool for the automated collection and analysis of test smells. During the tool's development, a first study was carried out with 11 open-source projects to examine the relationship between test smells and test coverage; a second study then assessed the quality of automatically generated tests in 21 open-source projects. In the first study, we found strong relationships between coverage metrics and test smells. In the second, the results indicate a high diffusion of test smells in the test code generated by the Evosuite and Randoop tools, as well as frequent co-occurrences between different types of test smells in the evaluated projects.
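To make the notion of a test smell concrete, consider Assertion Roulette, one of the smells most frequently reported in both hand-written and generated test code: a test method that contains several assertions without explanatory messages, so a failure report does not reveal which check broke. A minimal sketch using Python's `unittest` (the `Stack` class and its methods are hypothetical, introduced only for illustration):

```python
import unittest


class Stack:
    """Hypothetical class under test."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def size(self):
        return len(self._items)


class StackTest(unittest.TestCase):
    def test_stack(self):
        # Assertion Roulette smell: four assertions in one test method,
        # none carrying a message, so a failure gives no hint about
        # which behavior of Stack actually broke.
        s = Stack()
        s.push(1)
        s.push(2)
        self.assertEqual(s.size(), 2)
        self.assertEqual(s.pop(), 2)
        self.assertEqual(s.pop(), 1)
        self.assertEqual(s.size(), 0)
```

A smell-free version would split the method into focused tests, or attach a `msg` argument to each assertion; detection tools such as the one developed in this study flag patterns like the above automatically.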
References
Bavota, G., Qusef, A., Oliveto, R., Lucia, A. D., and Binkley, D. W. (2015). Are test smells really harmful? An empirical study. Empir. Softw. Eng., 20(4):1052–1094.
CISQ (2021). The Cost of Poor Software Quality in the US: A 2020 Report. https://www.it-cisq.org/pdf/CPSQ-2020-report.pdf. Accessed: March 1st, 2021.
Fraser, G. and Arcuri, A. (2014). A large-scale evaluation of automated unit test generation using evosuite. ACM Trans. on Software Engineering and Methodology, 24:1–42.
Garousi, V. and Kucuk, B. (2018). Smells in software test code: A survey of knowledge in industry and academia. Journal of Systems and Software, 138:52–81.
Gopinath, R., Jensen, C., and Groce, A. (2014). Code coverage for suite evaluation by developers. In Proceedings of the 36th International Conference on Software Engineering, ICSE.
Grano, G., Palomba, F., Nucci, D. D., Lucia, A. D., and Gall, H. C. (2019). Scented since the beginning: On the diffuseness of test smells in automatically generated test code. Journal of Systems and Software, 156:312–327.
Greiler, M., van Deursen, A., and Storey, M. D. (2013). Automated detection of test fixture strategies and smells. In 6th International Conference on Software Testing, Verification and Validation (ICST). IEEE Computer Society.
Meszaros, G., Smith, S., and Andrea, J. (2003). The test automation manifesto. In Maurer, F. and Wells, D., editors, Third XP and Second Agile Universe Conference, volume 2753 of LNCS. Springer.
Pacheco, C. and Ernst, M. (2007). Randoop: Feedback-directed random testing for Java. In Proceedings of the Conference on Object-Oriented Programming Systems, Languages, and Applications, OOPSLA, pages 815–816.
Palomba, F., Nucci, D. D., Panichella, A., Oliveto, R., and Lucia, A. D. (2016). On the diffusion of test smells in automatically generated test code: An empirical study. In 9th International Workshop on Search-Based Software Testing (SBST). ACM.
Peruma, A., Almalki, K., Newman, C. D., Mkaouer, M. W., Ouni, A., and Palomba, F. (2019). On the distribution of test smells in open source android applications: An exploratory study. In 29th Annual International Conference on Computer Science and Software Engineering (CASCON). ACM.
Silva, I. P. S. C., Alves, E. L. G., and Andrade, W. L. (2017). Analyzing automatic test generation tools for refactoring validation. pages 38–44.
Somé, S. S. and Cheng, X. (2008). An approach for supporting system-level test scenarios generation from textual use cases. In ACM Symposium on Applied Computing. ACM.
Spadini, D., Schvarcbacher, M., Oprescu, A.-M., Bruntink, M., and Bacchelli, A. (2020). Investigating severity thresholds for test smells. In 17th International Conference on Mining Software Repositories (MSR). ACM.
van Deursen, A., Moonen, L., van den Bergh, A., and Kok, G. (2001). Refactoring test code. In 2nd International Conference on Extreme Programming and Flexible Processes in Software Engineering (XP).
Virgínio, T., Martins, L., Rocha, L., Santana, R., Costa, H., and Machado, I. (2020a). An empirical study of automatically-generated tests from the perspective of test smells. In 34th Brazilian Symposium on Software Engineering (SBES). ACM.
Virgínio, T., Martins, L., Rocha, L., Santana, R., Cruz, A., Costa, H., and Machado, I. (2020b). JNose: Java Test Smell Detector. In 34th Brazilian Symposium on Software Engineering (SBES). ACM.
Virgínio, T., Martins, L., Rocha, L., Santana, R., Cruz, A., Costa, H., and Machado, I. (2021). On the test smells detection: An empirical study on the JNose Test accuracy. Journal of Software Engineering Research and Development (JSERD). Under review.
Virgínio, T., Railana, S., Martins, L. A., Soares, L. R., Costa, H., and Machado, I. (2019). On the influence of test smells on test coverage. In 33rd Brazilian Symposium on Software Engineering (SBES). ACM.
Wood, C. L. and Altavela, M. M. (1978). Large-sample results for Kolmogorov-Smirnov statistics for discrete distributions. Biometrika, 65(1):235–239.
