Automated Thematic Coherence Scoring of Student Essays Written in Portuguese

Rafael Pacheco; Luiz Rodrigues; Lucas Lins; Péricles Miranda; Valmir Macário; Seiji Isotani; Thiago Cordeiro; Ig Ibert Bittencourt; Diego Dermeval; Dragan Gašević; Rafael Ferreira Mello

doi:10.5753/sbie.2023.233447

Rafael Pacheco UFPE
Luiz Rodrigues UFAL
Lucas Lins UFRPE
Péricles Miranda UFRPE
Valmir Macário UFRPE
Seiji Isotani USP / Harvard Graduate School of Education
Thiago Cordeiro UFAL
Ig Ibert Bittencourt UFAL / Harvard Graduate School of Education
Diego Dermeval UFAL
Dragan Gašević Monash University
Rafael Ferreira Mello CESAR School / Monash University

Abstract

While Thematic Coherence is a fundamental aspect of essay writing, scoring it is labor-intensive. This issue is often addressed by using machine learning algorithms to estimate the score. However, related work is mostly limited to the English language or to argumentative essays. Consequently, there is a lack of research on other widely used languages and essay types, such as Brazilian Portuguese and narrative essays. Hence, this paper reports the findings of a study that assessed the value of machine learning algorithms for automatically scoring the Thematic Coherence of both narrative (n = 400) and argumentative (n = 6567) essays written in Brazilian Portuguese. Expanding on previous studies, this paper evaluated regression models built with conventional, feature-based algorithms trained on the essays' linguistic features. Overall, we found that Extra Trees was the best-performing algorithm, yielding predictions with moderate to strong correlations with human-generated scores. These findings expand the literature with evidence on the potential of machine learning to estimate the Thematic Coherence of narrative and argumentative essays, and suggest better performance for the former type.
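As an illustration of how predictions can be compared with human-generated scores, the following is a minimal sketch that computes a correlation coefficient between the two. It assumes Pearson's r, since the abstract does not name the coefficient used. The function name `pearson` and the score values are hypothetical and are not taken from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values (assumption): scores assigned by a human rater
# and scores predicted by a regression model for the same six essays.
human_scores = [0, 40, 80, 120, 160, 200]
predicted_scores = [20, 50, 70, 130, 150, 190]

r = pearson(human_scores, predicted_scores)
print(f"r = {r:.3f}")
```

A value of r close to 1 means the predicted scores rank and space the essays much as the human rater did. Values near 0 indicate little linear agreement.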
