ABSTRACT
Software testing is an important part of software development. Beyond revealing faults in the code, good tests should help developers correct those faults and should be easy to update when the code changes. Automatically generated tests can save time and may achieve higher code coverage; however, they may be less readable and not based on realistic usage scenarios. Little research has evaluated whether automatically generated tests are maintainable and support developers when maintaining code. To further investigate this issue, we performed an empirical study with 20 real developers, comparing how they perform maintenance tasks with automatically generated (EvoSuite or Randoop) and manually written test cases. Our results indicate that automatically generated tests can be a great help for identifying faults during maintenance. We also found that all strategies were similarly effective and similarly efficient at helping developers produce correct bug fixes. Therefore, developers may integrate generated test suites into a project at any stage.
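To illustrate the readability contrast the abstract refers to, the sketch below pairs a manually written test (intention-revealing name, realistic scenario) with an EvoSuite/Randoop-style generated one (opaque name, tool-chosen values, regression assertions captured from current behavior). The example is hypothetical and uses `java.util.ArrayDeque` as the unit under test; it is not taken from the study's materials.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only -- both tests are hypothetical, not from the study.
public class TestStyleDemo {

    // Manually written style: readable name encoding a realistic scenario.
    static void popReturnsMostRecentlyPushedElement() {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("first");
        stack.push("second");
        if (!stack.pop().equals("second")) {
            throw new AssertionError("expected LIFO order");
        }
    }

    // Generated style (EvoSuite/Randoop-like): opaque name, tool-chosen
    // inputs, assertions recorded from the implementation's current output.
    static void test0() {
        ArrayDeque<String> arrayDeque0 = new ArrayDeque<>();
        arrayDeque0.push("");
        String string0 = arrayDeque0.pop();
        if (!"".equals(string0)) {
            throw new AssertionError();
        }
        if (!arrayDeque0.isEmpty()) {
            throw new AssertionError();
        }
    }

    public static void main(String[] args) {
        popReturnsMostRecentlyPushedElement();
        test0();
        System.out.println("both styles pass");
    }
}
```

Both tests exercise the same behavior, but only the first documents why the assertion should hold, which is the kind of difference the study's maintenance tasks probe.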
Manually Written or Generated Tests? A Study with Developers and Maintenance Tasks