Exploring Self-Regulated Learning in Virtual Environments: An Experimental Clustering-Based Approach
Abstract
Following the COVID-19 pandemic, the volume of educational data generated in online environments increased substantially. In this context, this study investigates signs of self-regulated learning (SRL) in virtual environments by applying educational data mining techniques to analyze student behavior. Data were collected from the Moodle logs of a technical course offered by a federal public educational institution and then preprocessed. Clustering algorithms, including K-Means, HDBSCAN, and Agglomerative Clustering, were then applied to identify behavior patterns related to SRL. Unlike previous studies, which focused mainly on student profiling or on general correlations between engagement and performance, this research explores how the behavioral patterns revealed by clustering relate directly to SRL indicators. The results show that HDBSCAN and K-Means were more effective at forming meaningful groups. The analysis revealed that students who exhibited stronger indications of SRL tended to engage more with learning resources and to achieve higher grades. This study contributes to a more nuanced understanding of SRL dynamics in virtual environments and highlights the potential of educational data mining techniques for identifying relevant behaviors, offering insights for the development of pedagogical practices that promote student autonomy.
References
Aldowah, H., Al-Samarraie, H., and Fauzy, W. M. (2019). Educational data mining and learning analytics for 21st century higher education: A review and synthesis. Telematics and Informatics, 37:13–49.
Cavalcanti, A., Dourado, R., Rodrigues, R., Alves, N., Silva, J., and Ramos, J. L. C. (2018). An analysis of self-regulated learning behavioral diversity in different scenarios in distance learning courses. In Brazilian Symposium on Computers in Education (Simpósio Brasileiro de Informática na Educação-SBIE), volume 29, page 1493.
Costa, J., Dorça, F., and Araújo, R. (2020). Avaliação do comportamento de estudantes em um ambiente educacional ubíquo. In Anais do XXXI Simpósio Brasileiro de Informática na Educação, pages 182–191, Porto Alegre, RS, Brasil. SBC.
Damayanti, A., Kusumawardani, S. S., and Wibirama, S. (2023). A review of learners’ self-regulated learning behavior analysis using log-data traces. In 2023 IEEE 12th International Conference on Engineering Education (ICEED), pages 90–95. IEEE.
Davies, R., Allen, G., Albrecht, C., Bakir, N., and Ball, N. (2021). Using educational data mining to identify and analyze student learning strategies in an online flipped classroom. Education Sciences, 11(11):668.
De Winter, J. C., Gosling, S. D., and Potter, J. (2016). Comparing the Pearson and Spearman correlation coefficients across distributions and sample sizes: A tutorial using simulations and empirical data. Psychological Methods, 21(3):273.
Dinh, D.-T., Fujinami, T., and Huynh, V.-N. (2019). Estimating the optimal number of clusters in categorical data clustering by silhouette coefficient. In International Symposium on Knowledge and Systems Sciences, pages 1–17. Springer.
Farida, A. and Sudibyo, N. A. (2022). Implementation of the k-means algorithm on learning outcomes and self-regulated learning. UNION: Jurnal Ilmiah Pendidikan Matematika, 10(2):147–154.
Furlanetto, G., Carvalho, V., Baldassin, A., and Manacero, A. (2022). Algoritmos de agrupamento aplicados à detecção de fraudes. In Anais da XIII Escola Regional de Alto Desempenho de São Paulo, pages 29–32, Porto Alegre, RS, Brasil. SBC.
McKinney, W. et al. (2010). Data structures for statistical computing in Python. In Proceedings of the 9th Python in Science Conference, volume 445, pages 51–56. Austin, TX.
Moodle (2024). Registered Moodle sites. Available at [link].
Nuankaew, P., Nasa-Ngium, P., and Nuankaew, W. S. (2022). Self-regulated learning styles in hybrid learning using educational data mining analysis. In 2022 26th International Computer Science and Engineering Conference (ICSEC), pages 208–212. IEEE.
Panadero, E. (2017). A Review of Self-regulated Learning: Six Models and Four Directions for Research. Frontiers in Psychology, 8:422.
Peraić, I. and Grubišić, A. (2023). Exploring student engagement in online programming courses: A two-level k-means analysis. In 2023 International Conference on Software, Telecommunications and Computer Networks (SoftCOM), pages 1–6. IEEE.
Ramos, J., Santos, L., Silva, J., and Rodrigues, R. (2020). Identificação de perfis de interação de estudantes de educação a distância por meio de técnicas de agrupamentos. In Anais do XXXI Simpósio Brasileiro de Informática na Educação, pages 932–941, Porto Alegre, RS, Brasil. SBC.
Rodriguez, F., Lee, H. R., Rutherford, T., Fischer, C., Potma, E., and Warschauer, M. (2021). Using clickstream data mining techniques to understand and support first-generation college students in an online chemistry course. In LAK21: 11th International Learning Analytics and Knowledge Conference, pages 313–322.
Rousseeuw, P. J. (1987). Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20:53–65.
Salloum, S. A., Alshurideh, M., Elnagar, A., and Shaalan, K. (2020). Mining in educational data: review and future directions. In The International Conference on Artificial Intelligence and Computer Vision, pages 92–102. Springer.
Baker, R. S. J., Isotani, S., and Carvalho, A. M. J. B. (2011). Mineração de dados educacionais: Oportunidades para o Brasil. Revista Brasileira de Informática na Educação, 19:3–13.
Spearman, C. (1961). The proof and measurement of association between two things. The American Journal of Psychology, 100(3/4):441–471.
Urdan, T. (2010). Statistics in Plain English, Third Edition. Taylor & Francis.
World Health Organization (2020). WHO Director-General’s opening remarks at the media briefing on COVID-19 - 11 March 2020. World Health Organization. Available at [link].
Zar, J. H. (2005). Spearman rank correlation. Encyclopedia of Biostatistics, 7.
Zimmerman, B. and Martinez-Pons, M. (1986). Development of a structured interview for assessing student use of self-regulated learning strategies. American Educational Research Journal, 23:614–628.
Zimmerman, B. J. (2000). Attaining self-regulation: A social cognitive perspective. In Handbook of Self-Regulation, pages 13–39. Elsevier.
Published
24/11/2025
How to Cite
COSTA, Juliete A. R.; LIMA, Geycy D. O.; ARAÚJO, Rafael D.; DORÇA, Fabiano A. Exploring Self-Regulated Learning in Virtual Environments: An Experimental Clustering-Based Approach. In: SIMPÓSIO BRASILEIRO DE INFORMÁTICA NA EDUCAÇÃO (SBIE), 36., 2025, Curitiba/PR. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 469-482. DOI: https://doi.org/10.5753/sbie.2025.12462.
