Evaluating the Impact of Preprocessing Strategies for Learning Event Sequences on Sequential Pattern Mining Algorithms
Abstract
Learning event data that carry temporal attributes make it possible to analyze learning as a process that unfolds over time, for example with Sequential Pattern Mining (SPM) algorithms. However, few studies in the current literature evaluate how strategies for preprocessing these event sequences affect the patterns the algorithms identify. This study investigates the impact of three preprocessing strategies proposed in the literature on the patterns identified by the PrefixSpan algorithm, using a real dataset from distance-learning courses offered on the Moodle platform. The results were analyzed quantitatively and qualitatively. They suggest that the “Coalescing Repeating Point Events into One” strategy had the greatest impact on noise removal, although the combined use of the three strategies contributed to improving the quality of the detected patterns.
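To make the idea behind “Coalescing Repeating Point Events into One” concrete, the sketch below collapses each run of consecutive identical events in a sequence into a single event. This is a simplified reading and an assumption on our part: the strategy as defined in the literature may also take into account the time elapsed between repetitions. The function name `coalesce_repeats` and the event labels are hypothetical and are not taken from the study's dataset.

```python
from itertools import groupby


def coalesce_repeats(events):
    """Collapse each run of consecutive identical events into one event.

    Assumption: repetition is judged only by adjacency in the sequence,
    not by timestamps.
    """
    # groupby yields one (key, run) pair per run of equal adjacent items.
    return [event for event, _run in groupby(events)]


# Hypothetical Moodle-like event labels for a single learner session.
session = [
    "view_course",
    "view_forum",
    "view_forum",
    "view_forum",
    "submit_quiz",
    "view_forum",
]

coalesced = coalesce_repeats(session)
print(coalesced)
# ['view_course', 'view_forum', 'submit_quiz', 'view_forum']
```

Only adjacent repetitions are merged, so the final `view_forum` survives because it follows a different event. Removing such runs before mining can keep patterns like "view_forum, view_forum, view_forum" from crowding out more informative sequences.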