A machine learning approach to identify and prioritize college students at risk of dropping out

  • Artur Mesquita Barbosa Universidade Federal do Ceará (UFC)
  • Emanuele Santos Universidade Federal do Ceará (UFC)
  • João Paulo P. Gomes Universidade Federal do Ceará (UFC)

Abstract


In this paper, we present a student dropout prediction strategy based on the classification with reject option paradigm. In this strategy, our method classifies students as dropout-prone or non-dropout-prone, and may decline to classify (reject) a student when the algorithm cannot provide a reliable prediction. The rejected students are those who could plausibly be assigned to either class, and so are probably the ones with the best chances of success when given personalized intervention activities. In the proposed method, the reject zone can be adjusted so that the number of rejected students matches the workforce available at the educational institution. Our method was tested on a dataset collected from 892 undergraduate students from 2005 to 2016.
Keywords: Student dropout, Machine Learning, Prediction, Personalized intervention
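
The adjustable reject zone described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the underlying classifier, so the sketch assumes that some model has already produced a dropout probability per student, and it uses a generic confidence-based rule that rejects the students whose probability lies closest to 0.5. The function name `classify_with_reject`, the `capacity` parameter, and the sample scores are all hypothetical.

```python
from typing import Dict, List, Tuple


def classify_with_reject(
    dropout_prob: Dict[str, float], capacity: int
) -> Tuple[List[str], List[str], List[str]]:
    """Split students into (dropout_prone, non_dropout_prone, rejected).

    dropout_prob: hypothetical mapping of student id -> estimated
        probability of dropping out (assumed to come from any classifier).
    capacity: number of students the institution can follow up
        individually; this sets the size of the reject zone.
    """
    # A prediction is least reliable when its probability is closest to 0.5,
    # so rank students by that distance and reject the first `capacity`.
    by_uncertainty = sorted(dropout_prob, key=lambda s: abs(dropout_prob[s] - 0.5))
    rejected = by_uncertainty[: max(capacity, 0)]
    rejected_set = set(rejected)

    # Everyone outside the reject zone gets a hard label.
    dropout_prone = [
        s for s, p in dropout_prob.items() if s not in rejected_set and p >= 0.5
    ]
    non_dropout_prone = [
        s for s, p in dropout_prob.items() if s not in rejected_set and p < 0.5
    ]
    return dropout_prone, non_dropout_prone, rejected


# Illustrative scores only; these are not data from the paper.
scores = {"s1": 0.95, "s2": 0.55, "s3": 0.48, "s4": 0.10, "s5": 0.70}
prone, safe, rejected = classify_with_reject(scores, capacity=2)
print("dropout-prone:", prone)
print("non-dropout-prone:", safe)
print("rejected (candidates for personalized intervention):", rejected)
```

Raising or lowering `capacity` widens or narrows the reject zone, which is one simple way to keep the number of rejected students in line with the staff available for intervention.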

Published
30/10/2017
BARBOSA, Artur Mesquita; SANTOS, Emanuele; GOMES, João Paulo P. A machine learning approach to identify and prioritize college students at risk of dropping out. In: SIMPÓSIO BRASILEIRO DE INFORMÁTICA NA EDUCAÇÃO (SBIE), 28., 2017, Recife/PE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2017. p. 1497-1506. DOI: https://doi.org/10.5753/cbie.sbie.2017.1497.