On the test smells detection: an empirical study on the JNose Test accuracy
Keywords: Tests Quality, Test Evolution, Test Smells, Evidence-based Software Engineering
Several strategies have been proposed for test quality measurement and analysis. Code coverage is likely the most widely used one. It makes it possible to verify the ability of a test case to cover as many source code branches as possible. Although code coverage has been widely used, novel strategies have recently been employed.
Test smells analysis, for example, has been introduced as an affordable strategy to evaluate test code quality. Test smells are poor design choices in implementation, and their occurrence in test code might reduce test suite quality. Test smell identification depends mostly on tool support; without it, the strategy could become cost-ineffective. In an earlier study, we proposed the JNose Test, a tool to analyze test suite quality from the test smells perspective. The JNose Test detects twenty-one types of test smells across software versions. This study extends the previous one in two directions: i) the test smell detection rules were extracted to an API, named JNose-Core, that provides an extensible architecture for implementing new detection rules or supporting new programming languages; and ii) we performed an empirical study to evaluate the tool's effectiveness in detecting test smells and to compare the JNose Test with the state-of-the-art tool, tsDetect. Results showed that the JNose-Core precision score ranges from 91% to 100%, and the recall score from 89% to 100%. JNose-Core also showed a slight improvement over tsDetect in the detection rules for test smells at the class level.
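For readers unfamiliar with the two scores reported above, the following is a minimal sketch of how precision and recall are conventionally computed for a smell detector against a manually built oracle. It is not taken from JNose-Core: the function name, the class names, and the sample data are hypothetical, and returning 0.0 when a set is empty is an assumption made only for this illustration.

```python
def precision_recall(detected, oracle):
    """Compare tool output against a manually validated oracle.

    Each item is a (test_class, smell_type) pair. This is a generic
    illustration, not the procedure implemented by JNose-Core.
    """
    detected, oracle = set(detected), set(oracle)
    true_positives = len(detected & oracle)
    # Assumption for this sketch: an empty set yields a score of 0.0.
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(oracle) if oracle else 0.0
    return precision, recall


# Hypothetical oracle: smells confirmed by manual inspection.
oracle = [
    ("CartTest", "Assertion Roulette"),
    ("CartTest", "Duplicate Assert"),
    ("UserTest", "Assertion Roulette"),
    ("OrderTest", "Duplicate Assert"),
]

# Hypothetical tool output: three correct reports, one smell missed.
detected = [
    ("CartTest", "Assertion Roulette"),
    ("CartTest", "Duplicate Assert"),
    ("UserTest", "Assertion Roulette"),
]

precision, recall = precision_recall(detected, oracle)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case every reported smell is correct, so precision is 1.0, while one confirmed smell goes unreported, so recall is 0.75.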
Copyright (c) 2021 Tássio Virgínio, Luana Martins, Railana Santana, Adriana Cruz, Larissa Rocha, Heitor Costa, Ivan Machado
This work is licensed under a Creative Commons Attribution 4.0 International License.