Efficiency Evaluation in Reading: a Natural Language Processing-Based approach

  • Túlio Sousa de Gois UFS
  • Raquel Meister Ko. Freitag UFS

Abstract


The cloze test, widely used for its low cost and flexibility, assesses reading comprehension by having readers fill in gaps in a text, a task that requires them to draw on diverse linguistic repertoires. However, traditional scoring methods, which accept only exact answers, limit the identification of nuances in student performance. This study proposes an automated evaluation model for the cloze test in Brazilian Portuguese that integrates orthographic (edit distance), grammatical (POS tagging) and semantic (similarity between embeddings) analyses. The integrated method achieved a high correlation with human evaluation (ρ = 0.832). The results indicate that the automated approach is robust, sensitive to variations in linguistic repertoire and suitable for educational contexts that require scalability.
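
As an illustration only, the three analyses named in the abstract could be combined into a single per-gap score along the following lines. This is a minimal sketch and not the authors' implementation: the function names, the weights, and the linear combination are assumptions, and the POS tags and embedding vectors are passed in as plain values, whereas in practice they would come from a tagger and a pretrained language model.

```python
from math import sqrt


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def orthographic_similarity(answer: str, expected: str) -> float:
    """Edit distance normalized to [0, 1], where 1 means identical strings."""
    longest = max(len(answer), len(expected))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(answer, expected) / longest


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two embedding vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def gap_score(answer, expected, answer_pos, expected_pos,
              answer_vec, expected_vec, weights=(0.3, 0.2, 0.5)):
    """Weighted mix of the three signals; the weights are hypothetical."""
    w_orth, w_gram, w_sem = weights
    orth = orthographic_similarity(answer.lower(), expected.lower())
    gram = 1.0 if answer_pos == expected_pos else 0.0  # same part of speech?
    sem = max(0.0, cosine_similarity(answer_vec, expected_vec))
    return w_orth * orth + w_gram * gram + w_sem * sem
```

Under this sketch, a misspelled but otherwise correct answer keeps most of its orthographic credit and all of its grammatical and semantic credit, while a synonym with a different spelling is rewarded mainly through the semantic term. That graded behavior is the kind of nuance exact-match scoring cannot capture.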

Published
2025-09-29
GOIS, Túlio Sousa de; FREITAG, Raquel Meister Ko. Efficiency Evaluation in Reading: a Natural Language Processing-Based approach. In: BRAZILIAN SYMPOSIUM IN INFORMATION AND HUMAN LANGUAGE TECHNOLOGY (STIL), 16., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 161-169. DOI: https://doi.org/10.5753/stil.2025.37822.