A Method for Automatic Quality Assessment of Open Educational Resources Using Deep Learning

Murilo Gleyson Gazzola

doi:10.5753/cbie.sbie.2017.1477

Universidade de São Paulo (USP)

Abstract

Open Educational Resources (OERs) are openly licensed documents used for teaching, learning, and research. They include full courses, textbooks, videos, software, and any other tools, materials, or techniques that support access to knowledge. The main difficulty, however, is ensuring the quality of these educational resources stored in online repositories. To fill this gap, a method was created using deep neural networks, specifically a Recurrent Neural Network (RNN), for the automated quality assessment of open educational resources, and it was compared with a Support Vector Machine (SVM) and its variations. The research methodology consisted of designing a neural network architecture, building a controlled scenario, and comparing the method with the main works that perform automated assessment of OERs.

Keywords: Open Educational Resources, OERs, Deep Learning, Recurrent Neural Network, RNN
