Portuguese Automatic Short Answer Grading
Abstract
Automatic Short Answer Grading is the research field that addresses the assessment of students’ natural-language answers to questions. Beyond answer length, it differs from automatic essay grading by evaluating the content of an answer rather than its style. Grading the answers is generally framed as a typical supervised classification task. Many systems have been developed in recent years, but most of them target English-language data. In this paper, we present a new Portuguese dataset and system for automatic short answer grading. The data was collected with the participation of 13 teachers, 12 undergraduate students, and 245 elementary school students. The system achieved 69% accuracy on four-class classification and 85% on binary classification.
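To make the classification framing concrete, the sketch below shows one minimal way a binary grader can be built: compare a student answer to a reference answer by bag-of-words cosine similarity and threshold the score. This is only an illustrative baseline under assumed names and an assumed 0.5 threshold, not the system described in this paper, which trains a supervised classifier on graded answers.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenization (a deliberately simple assumption)."""
    return text.lower().split()

def cosine(a, b):
    """Cosine similarity between the bag-of-words vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in set(va) & set(vb))
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def grade_binary(student_answer, reference_answer, threshold=0.5):
    """Label an answer 'correct' if its lexical overlap with the
    reference answer clears the (hypothetical) threshold."""
    score = cosine(student_answer, reference_answer)
    return "correct" if score >= threshold else "incorrect"
```

A trained classifier would replace the fixed threshold with a decision learned from teacher-graded examples, and richer features (lexical, syntactic, semantic) would replace raw word counts.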
