Clustering of programming exercise solutions: a systematic literature mapping
Abstract
In programming courses, groups of students may adopt similar strategies when solving the programming exercises set by the instructor. Grouping students' code according to the strategies adopted can provide valuable insights about individual students and about classes as a whole. However, performing this grouping manually is laborious, so several studies in the literature have explored automatic approaches for grouping code according to the strategies used to solve the exercises. Against this background, this article presents a Systematic Literature Mapping (SLM) on the use of clustering techniques applied to solutions of programming exercises. A total of 22 articles were identified, in which the motivations for applying clustering included generating personalized feedback and identifying common errors among students.
Keywords:
programming, clustering, mapping
Published
4 November 2024
How to Cite
MELO, Rafaela; PESSOA, Marcela; FERNANDES, David. Clusterização de soluções de exercícios de programação: um mapeamento sistemático da literatura. In: SIMPÓSIO BRASILEIRO DE INFORMÁTICA NA EDUCAÇÃO (SBIE), 35., 2024, Rio de Janeiro/RJ. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 1715-1729. DOI: https://doi.org/10.5753/sbie.2024.242403.