Naming Practices in Object-oriented Programming: An Empirical Study
Keywords: Naming Identifiers, Program Comprehension, Mining Software Repositories
Research indicates that comprehending code takes up far more developer time than writing it. Because most modern programming languages place few or no restrictions on identifier names, developers choose names at their own discretion, which makes identifier naming a key aspect of code comprehension. Research on identifier naming shows that informative names are crucial to the readability and maintainability of programs: intention-revealing names make code easier to understand and act as a basic form of documentation, whereas poorly named identifiers tend to hurt the comprehensibility and maintainability of software systems. However, most computer science curricula emphasize programming concepts and language syntax over naming guidelines and conventions. Consequently, programmers lack knowledge about naming practices. This article extends our previous study, in which we set out to explore the naming practices of Java programmers. In that study, we analyzed 1,421,607 identifier names (i.e., attribute, parameter, and variable names) from 40 open-source Java projects and categorized them into eight naming practices. As a follow-up, we examined 40 open-source C++ projects and categorized 1,181,774 identifier names according to the same eight naming practices. We examined the occurrence and prevalence of these categories across the C++ and Java projects, and our results also highlight the contexts in which identifiers following each naming practice tend to appear more regularly. Finally, we conducted an online survey of 52 software developers to gain insight from industry. All in all, we believe the results, based on the analysis of 2,603,381 identifier names, can help raise programmers’ awareness and contribute to improving educational materials and code review methods.
Copyright (c) 2023 Remo Gresta, Vinicius Durelli, Elder Cirilo
This work is licensed under a Creative Commons Attribution 4.0 International License.