Investigating Artificial Intelligence Algorithms to Predict College Students’ Academic Performance: A Systematic Mapping Study

  • Henrique S. Rodrigues (UNIRIO)
  • Laura O. Moraes (UNIRIO)
  • Ana Cristina Bicharra Garcia (UNIRIO)
  • Reinaldo Viana Alvares (UNIRIO)
  • Rodrigo Pereira dos Santos (UNIRIO)
  • Carla Delgado (UFRJ)

Abstract


This study analyses Artificial Intelligence (AI) algorithms used to predict the academic performance of students at Higher Education Institutions (HEIs). We carried out a systematic mapping study (SMS) to investigate the limitations that researchers face when using AI algorithms to predict student academic performance and to determine which variable-algorithm combinations yield the best results. A set of 43 studies was selected for analysis. We found that the variables most commonly used to predict students' academic performance fall into four groups: socioeconomic (gender, age, and professional position); previous academic performance (grades, GPA, frequency, and entrance exam scores); internet activity (use of learning management systems (LMS), e.g., Moodle or Google Classroom, and use of social media); and psychological and health (quality of sleep, eating habits, social life, and academic or professional workload). The results show that, at the time of publication, Random Forest is the most commonly used algorithm among those with the best evaluation results for predicting academic performance; that the most common limitation researchers face is the scarcity of available data; and that the data most commonly used by these algorithms relate to previous academic performance.

Published
20/07/2025
RODRIGUES, Henrique S.; MORAES, Laura O.; GARCIA, Ana Cristina Bicharra; ALVARES, Reinaldo Viana; SANTOS, Rodrigo Pereira dos; DELGADO, Carla. Investigating Artificial Intelligence Algorithms to Predict College Students’ Academic Performance: A Systematic Mapping Study. In: WORKSHOP SOBRE EDUCAÇÃO EM COMPUTAÇÃO (WEI), 33., 2025, Maceió/AL. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 98-112. ISSN 2595-6175. DOI: https://doi.org/10.5753/wei.2025.7173.