BOHR - A Tool for Identifying Atoms of Confusion in Java Code

  • Wendell Mendes UFC
  • Windson Viana UFC
  • Lincoln Rocha UFC

Abstract


Understanding source code is fundamental to software development and maintenance. In this context, Atoms of Confusion (AC) are the smallest portions of code capable of confusing developers during this activity. This paper presents BOHR - The Atoms of Confusion Hunter, a tool that aims to: (i) assist in identifying AC in Java systems; (ii) report on the prevalence of these AC; and (iii) provide an API for developing custom search engines that capture new AC and improve the identification of existing ones. In this first version, BOHR detects 8 of the 13 types of AC identified by Langhout and Aniche. A preliminary evaluation of BOHR was conducted on three real open source systems widely adopted by the Java development community (Picasso, greenDAO, and Socket.IO-client Java). The initial results showed that BOHR accurately detected the AC present in the analyzed systems, correctly reporting each AC snippet along with its type, the name of the class it belongs to, and the line number where it occurs. In total, 105 AC were found: 37 in Picasso, 25 in greenDAO, and 43 in Socket.IO-client Java.
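To make the notion of an atom of confusion concrete, the sketch below pairs a few small Java snippets of the kind discussed in the AC literature with behavior-equivalent, clearer rewrites. It is illustrative only: the class and method names are hypothetical, the selection is not claimed to match the 8 types BOHR detects, and none of this is BOHR's own code or API.

```java
// Hypothetical examples; not taken from BOHR or from the evaluated projects.
public class AtomExamples {

    // Post-increment: the assignment uses the OLD value of i.
    static int postIncrementConfusing() {
        int i = 2;
        int j = i++; // j receives 2, then i becomes 3
        return j;
    }

    // Clearer: separate the read from the increment.
    static int postIncrementClear() {
        int i = 2;
        int j = i;
        i++;
        return j;
    }

    // Infix operator precedence: '*' binds tighter than '+'.
    static int precedenceConfusing() {
        return 2 + 3 * 4; // 14, not 20
    }

    // Clearer: explicit parentheses.
    static int precedenceClear() {
        return 2 + (3 * 4);
    }

    // Conditional operator packed into a single expression.
    static int conditionalConfusing(int x) {
        return x > 0 ? x : -x;
    }

    // Clearer: explicit if/else.
    static int conditionalClear(int x) {
        if (x > 0) {
            return x;
        } else {
            return -x;
        }
    }

    // Change of literal encoding: a leading 0 makes the literal octal.
    static int literalConfusing() {
        return 013; // octal 13 is decimal 11
    }

    // Clearer: plain decimal literal.
    static int literalClear() {
        return 11;
    }

    public static void main(String[] args) {
        System.out.println(postIncrementConfusing() == postIncrementClear());
        System.out.println(precedenceConfusing() == precedenceClear());
        System.out.println(conditionalConfusing(-5) == conditionalClear(-5));
        System.out.println(literalConfusing() == literalClear());
    }
}
```

Each pair returns the same value; the only difference is how much the reader must recall about Java's evaluation rules to predict it.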

Keywords: Code Comprehension, Atoms of Confusion, Static Code Analysis

References

Shulamyt Ajami, Yonatan Woodbridge, and Dror G Feitelson. 2019. Syntax, predicates, idioms—what really affects code complexity? Empirical Software Engineering 24, 1 (2019), 287–328.

K.H. Bennett, V.T. Rajlich, and N. Wilde. 2002. Software Evolution and the Staged Model of the Software Lifecycle. Advances in Computers, Vol. 56. Elsevier, 1–54. https://doi.org/10.1016/S0065-2458(02)80003-1

Pierre Carbonnelle. 2021. PYPL PopularitY of Programming Language. https://pypl.github.io/PYPL.html

Fernando Castor. 2018. Identifying confusing code in Swift programs. In Proceedings of the VI CBSoft Workshop on Visualization, Evolution, and Maintenance. ACM.

Benedito de Oliveira, Márcio Ribeiro, José Aldo Silva da Costa, Rohit Gheyi, Guilherme Amaral, Rafael de Mello, Anderson Oliveira, Alessandro Garcia, Rodrigo Bonifácio, and Baldoino Fonseca. 2020. Atoms of Confusion: The Eyes Do Not Lie. In Proceedings of the 34th Brazilian Symposium on Software Engineering (Natal, Brazil) (SBES ’20). Association for Computing Machinery, New York, NY, USA, 243–252. https://doi.org/10.1145/3422392.3422437

Rodrigo Magalhães dos Santos and Marco Aurélio Gerosa. 2018. Impacts of Coding Practices on Readability. In Proceedings of the 26th Conference on Program Comprehension (Gothenburg, Sweden) (ICPC ’18). Association for Computing Machinery, New York, NY, USA, 277–285. https://doi.org/10.1145/3196321.3196342

Felipe Ebert, Fernando Castor, Nicole Novielli, and Alexander Serebrenik. 2017. Confusion Detection in Code Reviews. In 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME). 549–553. https://doi.org/10.1109/ICSME.2017.40

Dan Gopstein, Anne-Laure Fayard, Sven Apel, and Justin Cappos. 2020. Thinking Aloud about Confusing Code: A Qualitative Investigation of Program Comprehension and Atoms of Confusion (ESEC/FSE 2020). Association for Computing Machinery, New York, NY, USA, 605–616. https://doi.org/10.1145/3368089.3409714

Dan Gopstein, Jake Iannacone, Yu Yan, Lois DeLong, Yanyan Zhuang, Martin K.-C. Yeh, and Justin Cappos. 2017. Understanding Misunderstandings in Source Code. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering (Paderborn, Germany) (ESEC/FSE 2017). Association for Computing Machinery, New York, NY, USA, 129–139. https://doi.org/10.1145/3106237.3106264

Dan Gopstein, Hongwei Henry Zhou, Phyllis Frankl, and Justin Cappos. 2018. Prevalence of Confusing Code in Software Projects: Atoms of Confusion in the Wild (MSR ’18). Association for Computing Machinery, New York, NY, USA, 281–291. https://doi.org/10.1145/3196398.3196432

Chris Langhout. 2020. Investigating the Perception and Effects of Misunderstandings in Java Code. Master’s thesis. Delft University of Technology.

Chris Langhout and Maurício Aniche. 2021. Atoms of Confusion in Java. arXiv:2103.05424 [cs.SE]

Roberto Minelli, Andrea Mocci, and Michele Lanza. 2015. I Know What You Did Last Summer - An Investigation of How Developers Spend Their Time. In 2015 IEEE 23rd International Conference on Program Comprehension. 25–35. https://doi.org/10.1109/ICPC.2015.12

Stephen O’Grady. 2021. The RedMonk Programming Language Rankings: January 2021. https://redmonk.com/sogrady/2021/03/01/language-rankings-1-21/

Michail Papamichail, Themistoklis Diamantopoulos, and Andreas Symeonidis. 2016. User-Perceived Source Code Quality Estimation Based on Static Analysis Metrics. In 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS). 100–107. https://doi.org/10.1109/QRS.2016.22

Renaud Pawlak, Martin Monperrus, Nicolas Petitprez, Carlos Noguera, and Lionel Seinturier. 2015. Spoon: A Library for Implementing Analyses and Transformations of Java Source Code. Software: Practice and Experience 46 (2015), 1155–1179. https://doi.org/10.1002/spe.2346

Gustavo Pinto, Weslley Torres, Benito Fernandes, Fernando Castor, and Roberto S.M. Barros. 2015. A large-scale study on the usage of Java’s concurrent programming constructs. Journal of Systems and Software 106 (2015), 59–81. https://doi.org/10.1016/j.jss.2015.04.064

Akond Rahman. 2018. Comprehension Effort and Programming Activities: Related? Or Not Related?. In Proceedings of the 15th International Conference on Mining Software Repositories (Gothenburg, Sweden) (MSR ’18). Association for Computing Machinery, New York, NY, USA, 66–69. https://doi.org/10.1145/3196398.3196470

Spencer Rugaber. 1995. Program comprehension. Encyclopedia of Computer Science and Technology 35, 20 (1995), 341–368.

The JUnit Team. 2021. JUnit 5. https://junit.org/junit5/

Xin Xia, Lingfeng Bao, David Lo, Zhenchang Xing, Ahmed E. Hassan, and Shanping Li. 2018. Measuring Program Comprehension: A Large-Scale Field Study with Professionals. IEEE Transactions on Software Engineering 44, 10 (2018), 951–976. https://doi.org/10.1109/TSE.2017.2734091
Published
2021-09-27
MENDES, Wendell; VIANA, Windson; ROCHA, Lincoln. BOHR - A Tool for Identifying Atoms of Confusion in Java Code. In: WORKSHOP ON SOFTWARE VISUALIZATION, EVOLUTION AND MAINTENANCE (VEM), 9., 2021, Joinville. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 41-45. DOI: https://doi.org/10.5753/vem.2021.17216.