Assessing Python Style Guides: An Eye-Tracking Study with Novice Developers
Abstract
Style guides play an essential role in software development: they shape code formatting, naming conventions, and structure to improve readability and simplify maintenance. However, many of these guides lack empirical studies that validate their recommendations. Previous studies have examined the impact of code styles on developer performance and concluded that some styles harm code readability. Still, more studies are needed that assess other perspectives, and combinations of these perspectives, on a common experimental basis. This study used eye tracking to investigate the impact of style-guide guidelines, with a special focus on PEP8, the Python style guide recognized for its best practices. We conducted a controlled experiment with 32 Python novices, measuring time, number of attempts, and visual effort (fixation duration, fixation count, and regression count) for four PEP8 recommendations. We also interviewed the subjects to explore their difficulties and preferences with the programs. The results show that not following the PEP8 Line Break after an Operator guideline increased the eye regression count by 70% in the code snippet where the guideline should have been applied. Most subjects preferred the version that adhered to the PEP8 guideline, and some found the left-aligned organization of operators easier to understand. The other evaluated guidelines revealed further nuances; for example, the PEP8-compliant version of the True Comparison guideline negatively affected eye metrics, even though subjects preferred the PEP8 suggestion. We recommend that practitioners select guidelines supported by experimental evaluations.
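To make the two guidelines named above concrete, the following minimal sketch contrasts each one with its non-PEP8 alternative. The variable names, values, and helper functions are hypothetical and chosen only for illustration; they are not the programs used in the experiment. The sketch assumes the PEP 8 wording that long expressions are best broken before binary operators (which left-aligns the operators) and that boolean values should not be compared to True with `==`.

```python
# Hypothetical values, used only to illustrate the two styles.
gross_wages = 5000
taxable_interest = 120
dividends = 80
ira_deduction = 300

# Operators trail at the end of each line (break after the operator).
income_trailing_operators = (gross_wages +
                             taxable_interest +
                             dividends -
                             ira_deduction)

# Operators lead each line (break before the operator), the layout
# PEP 8 suggests; the operators end up left-aligned.
income_leading_operators = (gross_wages
                            + taxable_interest
                            + dividends
                            - ira_deduction)


def describe_explicit(flag):
    # Explicit comparison to True, the form PEP 8 discourages.
    if flag == True:
        return "active"
    return "inactive"


def describe_implicit(flag):
    # Plain truthiness check, the form PEP 8 recommends.
    if flag:
        return "active"
    return "inactive"
```

Both layouts of the arithmetic expression compute the same value, and both functions return the same result for boolean inputs; only the visual organization that a reader has to scan differs.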