ABSTRACT
Software developers can benefit from machine learning solutions that predict bugs. Such solutions usually require large amounts of training data to achieve reliable results. To obtain this data, developers use bug-seeding approaches to generate synthetic bugs, which should resemble human-made bugs. A recent state-of-the-art tool, SemSeed, uses a semantics-aware bug-seeding approach that aims to produce more realistic bugs. In this study, we investigate SemSeed’s efficacy. We created a survey that shows developers a bug and asks whether it is Real or Synthetic. We collected and analyzed the answers of 47 developers, and we show that SemSeed can be very accurate in seeding realistic bugs.
Index Terms
- Analyzing a Semantics-Aware Bug Seeding Tool's Efficacy: A qualitative study with the SemSeed tool