Automated Essay Scoring: An approach based on ENEM competencies

  • Jeziel C. Marinho IFMA
  • Fábio Cordeiro UFPI
  • Rafael T. Anchiêta IFPI
  • Raimundo S. Moura UFPI

Abstract

This work presents strategies for Automated Essay Scoring (AES) of essays written in Portuguese, using an approach that defines features and AES models specific to each competency of the ENEM reference matrix. Methods based on feature engineering, embeddings, and Recurrent Neural Networks were investigated. Although the results obtained are better than those of related work, further studies are needed to improve the performance of AES models for the Portuguese language.
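The idea of one model per competency can be illustrated with a minimal sketch. This is not the authors' implementation: the two features (`type_token_ratio`, `token_count`), their assignment to the labels `C1` and `C2`, and the single-feature linear model are all hypothetical choices made only for illustration. The sketch relies on the ENEM convention that each competency is scored from 0 to 200 in steps of 40.

```python
from statistics import mean

# Valid ENEM score levels for a single competency (0 to 200 in steps of 40).
LEVELS = (0, 40, 80, 120, 160, 200)


def snap_to_level(raw):
    """Map a raw prediction to the nearest valid competency level."""
    return min(LEVELS, key=lambda level: abs(level - raw))


def type_token_ratio(text):
    """Hypothetical feature: share of distinct tokens in the essay."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def token_count(text):
    """Hypothetical feature: number of whitespace-separated tokens."""
    return float(len(text.split()))


# Assumption: each competency gets its own feature extractor.
# The pairing below is illustrative, not taken from the paper.
FEATURES = {
    "C1": type_token_ratio,
    "C2": token_count,
}


def fit_line(xs, ys):
    """Closed-form simple linear regression; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    var = sum((x - x_bar) ** 2 for x in xs)
    if var == 0:
        # Constant feature: fall back to predicting the mean score.
        return 0.0, y_bar
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / var
    return slope, y_bar - slope * x_bar


def train(essays, scores):
    """Fit one independent model per competency.

    `scores` maps a competency label to a list of scores aligned with `essays`.
    """
    models = {}
    for comp, feature in FEATURES.items():
        xs = [feature(essay) for essay in essays]
        models[comp] = fit_line(xs, scores[comp])
    return models


def predict(models, essay):
    """Predict a valid score level for each competency of one essay."""
    result = {}
    for comp, (slope, intercept) in models.items():
        raw = slope * FEATURES[comp](essay) + intercept
        result[comp] = snap_to_level(raw)
    return result
```

Keeping the models independent means each competency can use its own features and its own learner (for example, hand-crafted features for one and an embedding-based network for another), which is the design the abstract describes at a high level.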

Published
28/11/2022
How to Cite

MARINHO, Jeziel C.; CORDEIRO, Fábio; ANCHIÊTA, Rafael T.; MOURA, Raimundo S. Automated Essay Scoring: An approach based on ENEM competencies. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 19., 2022, Campinas/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 49-60. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2022.227202.