Educational data mining: database integration and performance analysis of UFSM students

  • Luís Gustavo Werle Tozevich UFSM
  • Jaime Antonio Daniel Filho UFSM
  • Guilherme Meneghetti Einloft UFSM
  • Tobias Viero de Oliveira UFSM
  • Jefferson Menezes de Oliveira UFSM
  • Joaquim Vinicius Carvalho Assunção UFSM

Abstract


This paper investigates the impact of teacher and institutional variables on pass rates in introductory mathematics courses at UFSM, using historical data from 2021 to 2023. Integrating internal records with public data enabled the application of data mining techniques, with k-means used to identify clusters among classes. Evaluating the clusters with the Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI) showed that variability in teaching assessments is the main factor associated with discrepancies in pass rates. By structuring data from different sources, we present insights that pave the way for studies on educational and pedagogical policies.
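
To illustrate how agreement between two groupings of the same classes can be scored with ARI and NMI, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation: the function names and the toy labels are hypothetical, the data are invented for illustration, and the NMI variant shown (normalized by the geometric mean of the two entropies) is one common choice assumed here.

```python
from collections import Counter
from math import comb, log, sqrt


def adjusted_rand_index(a, b):
    """Pair-counting ARI between two labelings of the same items."""
    n = len(a)
    if n < 2:
        return 1.0  # no pairs to compare; treat as trivially identical
    pairs_joint = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = pairs_a * pairs_b / comb(n, 2)  # chance-level agreement
    max_index = (pairs_a + pairs_b) / 2
    if max_index == expected:  # degenerate case, e.g. both partitions trivial
        return 1.0
    return (pairs_joint - expected) / (max_index - expected)


def normalized_mutual_information(a, b):
    """NMI normalized by the geometric mean of the two entropies."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(
        c / n * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )
    h_a = -sum(c / n * log(c / n) for c in count_a.values())
    h_b = -sum(c / n * log(c / n) for c in count_b.values())
    if h_a == 0 or h_b == 0:  # at least one partition has a single cluster
        return 1.0 if h_a == h_b else 0.0
    return mi / sqrt(h_a * h_b)


# Hypothetical toy data: cluster ids assigned to eight classes, and a
# reference grouping of the same classes (for example, pass-rate bands).
cluster_labels = [0, 0, 0, 1, 1, 1, 2, 2]
reference_labels = ["high", "high", "low", "low", "low", "low", "mid", "mid"]

print("ARI:", adjusted_rand_index(cluster_labels, reference_labels))
print("NMI:", normalized_mutual_information(cluster_labels, reference_labels))
```

Both scores depend only on how items are grouped, not on the label names, so relabeling the clusters leaves them unchanged. ARI is corrected for chance and can be negative when agreement is worse than expected at random.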

Published
2025-04-23
TOZEVICH, Luís Gustavo Werle; DANIEL FILHO, Jaime Antonio; EINLOFT, Guilherme Meneghetti; OLIVEIRA, Tobias Viero de; OLIVEIRA, Jefferson Menezes de; ASSUNÇÃO, Joaquim Vinicius Carvalho. Educational data mining: database integration and performance analysis of UFSM students. In: REGIONAL DATABASE SCHOOL (ERBD), 20., 2025, Florianópolis/SC. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 173-176. ISSN 2595-413X. DOI: https://doi.org/10.5753/erbd.2025.6848.