Data Mining for Student Outcome Prediction on Moodle: a systematic mapping

  • Igor Moreira Felix Universidade Federal de Goiás (UFG)
  • Ana Paula Ambrósio Universidade Federal de Goiás (UFG)
  • Priscila da Silva Lima Universidade Federal de Goiás (UFG)
  • Jacques Duilio Brancher Universidade Estadual de Londrina (UEL)

Abstract


Virtual learning environments facilitate online learning and generate and store large amounts of data during the teaching and learning process. Data mining makes it possible to extract valuable information from this stored data. In this article, we present a systematic mapping of 42 papers that apply data mining techniques to predict students' performance using Moodle data. The results show that decision trees are the most frequently used classification approach. Furthermore, students' interactions in forums are the Moodle attribute most often analyzed by researchers.

Keywords: data mining, student performance prediction, Moodle, decision trees, systematic mapping

Published
29/10/2018
FELIX, Igor Moreira; AMBRÓSIO, Ana Paula; LIMA, Priscila da Silva; BRANCHER, Jacques Duilio. Data Mining for Student Outcome Prediction on Moodle: a systematic mapping. In: SIMPÓSIO BRASILEIRO DE INFORMÁTICA NA EDUCAÇÃO (SBIE), 29., 2018, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2018. p. 1393-1402. DOI: https://doi.org/10.5753/cbie.sbie.2018.1393.