CoderBot 2.0: Integrating LLM and Prompt Engineering in the Evolution of an Educational Chatbot

Andre Mendes; Arthur Parizotto; Renato Garcia; Marcelino Garcia; Gilleanes Guedes; Ricardo Vilela; Pedro Valle; Renato Balancieri; Williamson Silva

doi:10.5753/sbie.2025.12321

Andre Mendes UNIPAMPA
Arthur Parizotto UNIPAMPA
Renato Garcia USP
Marcelino Garcia UEM
Gilleanes Guedes UNIPAMPA
Ricardo Vilela Unicamp
Pedro Valle USP
Renato Balancieri UEM
Williamson Silva UFCA / UNIPAMPA


Abstract

Teaching programming to beginners remains a significant challenge because it requires the development of complex cognitive skills. This article presents CoderBot 2.0, an evolution of a pedagogical agent for introductory programming instruction based on Example-Based Learning (EBL). The new version integrates Large Language Models (LLMs) and prompt engineering techniques to generate correct and erroneous examples on demand, along with code explanations and progressive feedback, without relying on fine-tuning. Aligned with Cognitive Load Theory, its architecture promotes metacognitive strategies such as self-explanation, fading, and contextual variation. The system offers adaptive support that adjusts to the student's level and encourages engagement, autonomy, and reflection.
