Evaluating the Role of GPT-4o in Software Engineering Learning: An Experience Report
Abstract
Since GPT models reached the market, there has been growing demand for studies of their impact across many fields, including education. This study is a structured experience report that explores the impact of AI-based tools, specifically the GPT-4o model accessed through the OpenAI API, on student learning in a Software Engineering course. The proposed tool was designed with specific system prompts to reduce students' cognitive load, allowing them to focus on the pedagogical task. Usage data from 28 participants were collected and analyzed using statistical methods and natural language processing. The results indicate that students without professional experience wrote longer prompts, while those with such experience tended to "like" more interactions. Liked prompts were more likely to contain code, and students with professional experience were also more likely to include code snippets in their queries. This study contributes a structured methodology for collecting, processing, and analyzing interaction data with generative AI in educational contexts, helping to guide future empirical research in the area.
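As an illustration only, the kind of per-group metrics described above (prompt length, share of liked interactions, share of prompts containing code) could be computed along the following lines. This is a minimal sketch, not the study's actual pipeline: the record fields (`experienced`, `liked`, `prompt`), the function names, and the regular-expression heuristic for spotting code are all assumptions made for this example, and the abstract does not state which statistical tests or NLP steps were used.

```python
import re
from statistics import median

# Hypothetical heuristic (an assumption, not the study's method): treat a prompt
# as containing code if it has a fenced block, a line that starts with a common
# programming keyword, or a line that ends in ';', '{' or '}'.
CODE_PATTERN = re.compile(
    r"`{3}|^\s*(def|class|import|public|private|function|#include)\b|[;{}]\s*$",
    re.MULTILINE,
)


def contains_code(prompt: str) -> bool:
    """Return True if the crude pattern above matches anywhere in the prompt."""
    return CODE_PATTERN.search(prompt) is not None


def word_count(prompt: str) -> int:
    """Count whitespace-separated tokens as a simple proxy for prompt length."""
    return len(prompt.split())


def summarize(interactions: list[dict]) -> dict:
    """Group interaction records by the 'experienced' flag and compute metrics."""
    groups: dict = {}
    for item in interactions:
        groups.setdefault(item["experienced"], []).append(item)

    summary = {}
    for experienced, items in groups.items():
        summary[experienced] = {
            "median_words": median(word_count(i["prompt"]) for i in items),
            "like_rate": sum(i["liked"] for i in items) / len(items),
            "code_rate": sum(contains_code(i["prompt"]) for i in items) / len(items),
        }
    return summary


# Toy records invented for this example; they are not data from the study.
sample = [
    {"experienced": False, "liked": False,
     "prompt": "What is a use case diagram?"},
    {"experienced": True, "liked": True,
     "prompt": "Fix this:\ndef total(items):\n    return sum(items)"},
]

if __name__ == "__main__":
    for group, metrics in summarize(sample).items():
        print(group, metrics)
```

A keyword heuristic like this will misclassify some natural-language prompts (for example, a sentence that begins with "class"), so a real analysis would likely need manual validation or a more robust classifier.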
