An Exploratory Analysis on Sociodemographics Features Importance For a Predictive Undergraduate Computing Students Dropout Model
Abstract
School dropout is a problem faced by educational systems worldwide, across various levels of education and types of institution. Many strategies have been studied and tested to address this issue, or at least to mitigate it. Advances in artificial intelligence, particularly machine learning, offer a promising opportunity to develop robust predictive models capable of accurately identifying complex patterns and anticipating dropout cases. This study explores the approaches that authors have taken in using machine learning to prevent school dropout, highlighting and comparing the feature engineering they adopted and the features that proved most relevant during training. By analyzing case studies and recent research, this work identifies the variables that are most important and those most frequently chosen by researchers when building machine learning models, and suggests which paths are more efficient and faster for new research.
Keywords:
machine learning, feature engineering, school dropout
Published
04/11/2024
How to Cite
BALSANELLO, Vitor Gabriel; SOUZA, Alinne Corrêa; SOUZA, Francisco Carlos Monteiro; DAMASCENO, Thiago Cordeiro. An Exploratory Analysis on Sociodemographics Features Importance For a Predictive Undergraduate Computing Students Dropout Model. In: SIMPÓSIO BRASILEIRO DE INFORMÁTICA NA EDUCAÇÃO (SBIE), 35., 2024, Rio de Janeiro/RJ. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 2548-2562. DOI: https://doi.org/10.5753/sbie.2024.242685.