Is it Possible to Predict Dropout Based Only on Academic Performance?

Abstract


One of the major problems of higher education in Brazil is the high student dropout rate. In this work, we apply data mining techniques, specifically classification techniques, to predict dropouts and thereby help prevent them. The predictive models are built based only on the students' performance in the subjects they have taken. We create n different models, where the i-th model, 1 ≤ i ≤ n, predicts, at the end of a student's i-th semester, whether he or she will eventually drop out or graduate. Experiments on a real database of students from a Brazilian university showed that the models achieve predictive accuracy between 79.31% and 98.25%.

Keywords: college dropout prediction, data mining, decision tree, random forest

Published
2021-11-22
SANTOS, Carlos Henrique D. C.; MARTINS, Simone de Lima; PLASTINO, Alexandre. Is it Possible to Predict Dropout Based Only on Academic Performance? In: BRAZILIAN SYMPOSIUM ON COMPUTERS IN EDUCATION (SBIE), 32., 2021, Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 792-802. DOI: https://doi.org/10.5753/sbie.2021.218105.