Estimating the Difficulty of Programming Problems in Brazilian Portuguese with Few Examples: An Evaluation of BERTimbau Embeddings

Abstract


Practice is a fundamental step in learning to program, and estimating the difficulty of programming questions is crucial for effective adaptive learning. Some approaches estimate difficulty from students’ past performance, but this is impractical when only the problem text is available. This study therefore evaluates question difficulty estimation from text (QDET) using semantic embeddings of the exercise text extracted with BERTimbau, a state-of-the-art BERT model pre-trained for Brazilian Portuguese, in a setting where only a small sample is available for training, known as few-shot learning. The results show that the embeddings generated by BERTimbau do not effectively capture problem difficulty from textual information alone, which underscores the need for models trained specifically for this task.

Published
24 November 2025
SENA, João Pedro M.; BARBOSA, Ellen Francine; SANTOS, André G.; REIS, Julio C. S. Estimating the Difficulty of Programming Problems in Brazilian Portuguese with Few Examples: An Evaluation of BERTimbau Embeddings. In: SIMPÓSIO BRASILEIRO DE INFORMÁTICA NA EDUCAÇÃO (SBIE), 36., 2025, Curitiba/PR. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 290-303. DOI: https://doi.org/10.5753/sbie.2025.12253.