Predicting Dropout in Higher Education: a Systematic Review

  • Jailma Januário da Silva, Universidade de São Paulo
  • Norton Trevisan Roman, Universidade de São Paulo


In this article, we present a systematic literature review, carried out from February to March 2020, on the application of machine learning techniques to predict student dropout in higher education institutions. Besides describing the protocol followed during our research, which includes the research questions, searched databases, and query strings, along with the criteria for inclusion and exclusion of articles, we also present our main results in terms of the attributes used by current research on this theme, along with the adopted approaches, specific algorithms, and evaluation metrics. Decision trees are the most used technique for building models, and accuracy, recall, and precision are the most used metrics for evaluating them.

Keywords: systematic review, dropout analysis, higher education


SILVA, Jailma Januário da; ROMAN, Norton Trevisan. Predicting Dropout in Higher Education: a Systematic Review. In: SIMPÓSIO BRASILEIRO DE INFORMÁTICA NA EDUCAÇÃO (SBIE), 32., 2021, Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 1107-1117. DOI: