Survival Analysis: a case study in an Information Systems Course

  • Joubert Alexandrino de Souza IFES
  • Karin Satie Komati IFES
  • Jefferson Oliveira Andrade IFES

Abstract


Dropout from higher education is a serious problem that has been investigated for decades and that harms individuals, educational institutions, and society as a whole. This article presents a case study that combines survival analysis with predictive models to identify the determinants of dropout in an undergraduate Information Systems course at a public higher education institution in Brazil. Educational data mining and probabilistic modeling methods were applied to student data to model students' expected completion of the course, semester by semester. The survival analysis indicates that the risk of dropout is greatest in the initial semesters of the course, while the analysis of the determining characteristics of dropout shows that the subjects of the first two semesters retain about 50% of the student population.
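
To illustrate what a semester-by-semester survival estimate looks like, the sketch below computes a Kaplan-Meier curve in plain Python. It is only an illustration under stated assumptions: the abstract does not say which estimator the authors used, the function name `kaplan_meier` is hypothetical, and the ten student records are invented for the example and are not the paper's data.

```python
from collections import Counter

def kaplan_meier(durations, dropped):
    """Kaplan-Meier survival estimate.

    durations: last semester in which each student was observed.
    dropped:   True if the student dropped out in that semester,
               False if the record is censored (still enrolled or graduated).
    Returns a list of (semester, survival probability) pairs, one per
    semester in which at least one dropout occurred.
    """
    events = Counter(t for t, d in zip(durations, dropped) if d)
    exits = Counter(durations)      # dropouts plus censored records
    at_risk = len(durations)        # students still enrolled before semester t
    survival = 1.0
    curve = []
    for t in sorted(exits):
        if events[t]:
            survival *= 1 - events[t] / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]         # everyone leaving at t exits the risk set
    return curve

# Invented records for illustration only (not the paper's data).
durations = [1, 1, 2, 2, 3, 4, 6, 8, 8, 8]
dropped = [True, True, True, False, True, False, True, False, False, False]

curve = kaplan_meier(durations, dropped)
for semester, probability in curve:
    print(f"semester {semester}: S(t) = {probability:.4f}")
```

In this toy data the curve falls fastest in the first semesters, which is the kind of pattern the abstract reports; censored records reduce the risk set without counting as dropouts.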

Keywords: Higher education dropout, Educational Data Mining

Published
2022-07-31
SOUZA, Joubert Alexandrino de; KOMATI, Karin Satie; ANDRADE, Jefferson Oliveira. Survival Analysis: a case study in an Information Systems Course. In: WORKSHOP ON COMPUTING EDUCATION (WEI), 30., 2022, Niterói. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 392-403. ISSN 2595-6175. DOI: https://doi.org/10.5753/wei.2022.223357.