Educational data mining to support identification and prevention of academic retention and dropout: a case study in introductory programming

Authors

M. G. Carneiro, B. L. Dutra, J. G. S. Paiva, P. H. R. Gabriel, R. D. Araújo

DOI:

https://doi.org/10.5753/rbie.2022.2518

Keywords:

Educational data mining, Higher education, Retention prevention, Dropout prevention, Academic analytics, Machine learning, Data classification

Abstract

Several works in the literature have emphasized data mining techniques as efficient tools for identifying factors related to retention and dropout in higher education. However, most of these works do not discuss whether (or how) such factors can effectively contribute to decreasing these rates. This article presents a data mining approach conceived to identify students at risk of retention in an Intro to Computer Programming course and to guide preventive interventions that help such students overcome this situation. Our results indicated an average predictive performance above 80% in both accuracy and F1 when identifying factors related to retention. Moreover, during the two years of the project's execution, the annual success rates in the course were the highest in comparison to the last five years.

Published

2022-09-27

How to Cite

CARNEIRO, M. G.; DUTRA, B. L.; PAIVA, J. G. S.; GABRIEL, P. H. R.; ARAÚJO, R. D. Educational data mining to support identification and prevention of academic retention and dropout: a case study in introductory programming. Revista Brasileira de Informática na Educação, [S. l.], v. 30, p. 379–395, 2022. DOI: 10.5753/rbie.2022.2518. Available at: https://sol.sbc.org.br/journals/index.php/rbie/article/view/2518. Accessed: 29 Mar. 2024.

Section

Articles