DOI: 10.1145/3422392.3422416
research-article

Manually Written or Generated Tests?: A Study with Developers and Maintenance Tasks

Published: 21 December 2020

ABSTRACT

Software testing is an important part of software development. Beyond revealing faults in the code, good tests should help developers correct those faults and should be easy to update when the code changes. Automatically generated tests can save time and may achieve higher code coverage. However, these tests may be less readable and may not reflect realistic scenarios. Few studies have evaluated whether automatically generated tests are maintainable and whether they support developers who are maintaining code. To investigate this issue further, we perform an empirical study with 20 real developers, comparing how they carry out maintenance tasks with automatically generated (EvoSuite or Randoop) and manually written test cases. Our results indicate that automatically generated tests can be a great help in identifying faults during maintenance. We also found that all strategies were similarly effective at helping developers produce correct bug fixes, and did so with similar efficiency. We therefore conclude that developers may integrate generated test suites into a project at any stage.

    • Published in

      ACM Other conferences
      SBES '20: Proceedings of the XXXIV Brazilian Symposium on Software Engineering
      October 2020
      901 pages
      ISBN: 9781450387538
      DOI: 10.1145/3422392

      Copyright © 2020 ACM

      © 2020 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 147 of 427 submissions, 34%
