ABSTRACT
Software testing is an important part of software development. Beyond revealing faults in the code, good tests should help developers correct those faults and should be easy to update when the code changes. Automatically generated tests can save time and may achieve higher code coverage; however, they may be less readable and not based on realistic usage scenarios. Little research has evaluated whether automatically generated tests are maintainable and support developers when maintaining code. To further investigate this issue, we performed an empirical study with 20 real developers, comparing how they perform maintenance tasks with automatically generated (EvoSuite or Randoop) and manually written test cases. Our results indicate that automatically generated tests can be a great help for identifying faults during maintenance. We also found that all strategies were similarly effective and similarly efficient at helping developers produce correct bug fixes. Therefore, developers may integrate generated test suites into a project at any stage.
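To illustrate the readability contrast the abstract refers to, the sketch below pairs a manually written test (intention-revealing name, realistic scenario) with an EvoSuite/Randoop-style generated one (opaque name, tool-chosen values, regression assertions captured from current behavior). The example is hypothetical and uses `java.util.ArrayDeque` as the unit under test; it is not taken from the study's materials.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only -- both tests are hypothetical, not from the study.
public class TestStyleDemo {

    // Manually written style: readable name encoding a realistic scenario.
    static void popReturnsMostRecentlyPushedElement() {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("first");
        stack.push("second");
        if (!stack.pop().equals("second")) {
            throw new AssertionError("expected LIFO order");
        }
    }

    // Generated style (EvoSuite/Randoop-like): opaque name, tool-chosen
    // inputs, assertions recorded from the implementation's current output.
    static void test0() {
        ArrayDeque<String> arrayDeque0 = new ArrayDeque<>();
        arrayDeque0.push("");
        String string0 = arrayDeque0.pop();
        if (!"".equals(string0)) {
            throw new AssertionError();
        }
        if (!arrayDeque0.isEmpty()) {
            throw new AssertionError();
        }
    }

    public static void main(String[] args) {
        popReturnsMostRecentlyPushedElement();
        test0();
        System.out.println("both styles pass");
    }
}
```

Both tests exercise the same behavior, but only the first documents why the assertion should hold, which is the kind of difference the study's maintenance tasks probe.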
Manually Written or Generated Tests? A Study with Developers and Maintenance Tasks