Development and Evaluation of an Intelligent Tutor for Programming Learning Based on Extensive Language Models

  • Oleksiy Levchuk, CICESE

Abstract


Automatic programming, an emergent capability that came to prominence with the popularization of Generative Artificial Intelligence, has raised uncertainty about the future of programming and its teaching. This doctoral work proposes the design and evaluation of an architecture for developing Intelligent Tutoring Systems for programming learning, integrating Large Language Models to provide a personalized user experience. The architecture is developed using a design-based research methodology, and its effect on cognitive engagement and learning is assessed through formative evaluations of prototypes and a summative evaluation conducted as an intervention study.
Keywords: Software Engineering, Large Language Models, Intelligent Tutoring Systems, Generative Artificial Intelligence, Programming Learning, Learning Personalization

Published: 2025-05-12
LEVCHUK, Oleksiy. Development and Evaluation of an Intelligent Tutor for Programming Learning Based on Extensive Language Models. In: IBERO-AMERICAN CONFERENCE ON SOFTWARE ENGINEERING (CIBSE), 28., 2025, Ciudad Real/Spain. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 273-280. DOI: https://doi.org/10.5753/cibse.2025.35313.