Using Principal Component Analysis to support students' performance prediction and data analysis
Abstract
We propose a method based on Principal Component Analysis (PCA) for predicting students' performance and for identifying relevant patterns in their characteristics. The method let us assess both how well students' performance can be predicted and how effective PCA is for interpreting patterns in educational data. We validated it on two public datasets describing students' achievements together with their social and personal characteristics. The experiments compared predictive performance on the original, high-dimensional datasets against their PCA-reduced counterparts. The results indicate that PCA retained the relevant information in the data and was useful for uncovering implicit knowledge in students' data.
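The comparison described above, predictive performance on the full feature set versus the PCA-reduced one, can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' pipeline: the data are synthetic stand-ins for the two public datasets, the nearest-centroid classifier is a placeholder for whatever predictors the study used, and the helper names (`pca_fit`, `pca_transform`, `nearest_centroid_accuracy`) are hypothetical.

```python
import numpy as np


def pca_fit(X, n_components):
    """Fit PCA on X (rows = students, columns = features) via SVD."""
    mean = X.mean(axis=0)
    Xc = X - mean  # PCA requires centered data
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    variance = S ** 2 / (X.shape[0] - 1)
    ratio = variance / variance.sum()
    return mean, Vt[:n_components], ratio[:n_components]


def pca_transform(X, mean, components):
    """Project X onto the retained principal components."""
    return (X - mean) @ components.T


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Placeholder classifier (an assumption, not the paper's model)."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    predictions = classes[dists.argmin(axis=1)]
    return float((predictions == y_test).mean())


# Synthetic stand-in data: 10 observed features driven by 2 latent factors.
rng = np.random.default_rng(0)
n_students = 200
latent = rng.normal(size=(n_students, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.1 * rng.normal(size=(n_students, 10))
y = (latent[:, 0] > 0).astype(int)  # hypothetical pass/fail label

X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Fit PCA on the training split only, then project both splits.
mean, components, ratio = pca_fit(X_train, n_components=2)
Z_train = pca_transform(X_train, mean, components)
Z_test = pca_transform(X_test, mean, components)

acc_full = nearest_centroid_accuracy(X_train, y_train, X_test, y_test)
acc_reduced = nearest_centroid_accuracy(Z_train, y_train, Z_test, y_test)

print(f"variance retained by 2 components: {ratio.sum():.3f}")
print(f"accuracy, all 10 features: {acc_full:.3f}")
print(f"accuracy, 2 principal components: {acc_reduced:.3f}")
```

Fitting the projection on the training split alone keeps information from the test split out of the reduced representation, so the two accuracies are compared on equal footing.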
