Investigation of text similarity methods in the context of automatic evaluation of discursive questions
Abstract
This article addresses the automated assessment of essay questions, a significant challenge for educators because grading them manually is laborious. Despite advances in this area, challenges persist, such as model fluctuation, the scarcity of datasets in Portuguese, and the diversity of available methods with no established standard. The objective of this work is to study and evaluate different text similarity techniques and tools for the automatic grading of discursive questions, aiming to offer solutions to the aforementioned challenges. In the experiments carried out, the best model obtained an average error of 13.9 points on a scale of 0 to 100, which encourages the use of such an approach in the educational context.
References
Amade, D., Chandra, R., Sinha, V., and Anand, D. (2024). Automatic text summarization using nltk spacy*. SSRN Electronic Journal.
Bao, H., Wang, Z. X., Cheng, X., Su, Z., Yang, Y.-H., Zhang, G.-Y., Wang, B., and Cai, H.-J. (2022). Using word embeddings to investigate human psychology: Methods and applications. Xinli kexue jinzhan, 31:887–887.
Burstein, J., Leacock, C., and Swartz, R. (2001). Automated evaluation of essays and short answers. Proceedings of the 5th CAA Conference, Loughborough: Loughborough University.
Dande, A. A. and Pund, D. M. A. (2023). A review study on applications of natural language processing. International journal of scientific research in science, engineering and technology.
de Oliveira, D., Pozzebon, E., and Santos, T. (2020). Aplicação das técnicas de processamento de linguagem natural cosine similarity e word movers distance para auxiliar na correção de questões discursivas em um tutor inteligente. In Anais do XXXI Simpósio Brasileiro de Informática na Educação, pages 1243–1252. SBC.
Galhardi, L., de Souza, R., and Brancher, J. (2020). Automatic grading of portuguese short answers using a machine learning approach. In Anais Estendidos do XVI Simpósio Brasileiro de Sistemas de Informação, pages 109–124. SBC.
Gomes, A. S. and Pimentel, E. P. (2021). Ambientes virtuais de aprendizagem para uma educação mediada por tecnologias digitais. Informática na Educação: ambientes de aprendizagem, objetos de aprendizagem e empreendedorismo. Porto Alegre: Sociedade Brasileira de Computação.
Hartmann, N., Fonseca, E., Shulby, C., Treviso, M., Rodrigues, J., and Aluisio, S. (2017). Portuguese word embeddings: Evaluating on word analogies and natural language tasks. arXiv preprint arXiv:1708.06025.
Januzaj, Y. and Luma, A. (2022). Cosine similarity - a computing approach to match similarity between higher education programs and job market demands based on maximum number of common words. International Journal of Emerging Technologies in Learning (iJET), pages 258–268.
McIntosh, T. R., Susnjak, T., Liu, T., Watters, P., and Halgamuge, M. N. (2023). From google gemini to openai q*(q-star): A survey of reshaping the generative artificial intelligence (ai) research landscape. arXiv preprint arXiv:2312.10868.
Mohler, M. and Mihalcea, R. (2009). Text-to-text semantic similarity for automatic short answer grading. In Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009), pages 567–575.
Savytska, L., Sübay, T., Vnukova, N., Bezugla, I., and Pyvovarov, V. (2022). Word2vec model analysis for semantic and morphologic similarities in turkish words. CEUR-WS.
Shields, J. A. E. (2022). Classroom assessment. In International Encyclopedia of Education (Fourth Edition), pages 519–528. Elsevier eBooks.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems. Curran Associates, Inc.
Wang, M. and Hu, F. (2021). The application of nltk library for python natural language processing in corpus research. Theory and Practice in Language Studies.
Yamagiwa, H., Yokoi, S., and Shimodaira, H. (2022). Improving word mover’s distance by leveraging self-attention matrix. arXiv preprint arXiv:2211.06229.
Published
2024-09-11
How to Cite
ALMEIDA, José Augusto O. da S.; MOURA, Raimundo Santos. Investigation of text similarity methods in the context of automatic evaluation of discursive questions. In: REGIONAL SCHOOL ON COMPUTING OF CEARÁ, MARANHÃO, AND PIAUÍ (ERCEMAPI), 12., 2024, Parnaíba/PI. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 110-118. DOI: https://doi.org/10.5753/ercemapi.2024.243606.
