BEATnIk: an algorithm for the automatic generation of educational description of movies

Vinicius Woloszyn; Guilherme M. Machado; José Palazzo; Horacio Saggion; Leandro Krug Wives

doi:10.5753/cbie.sbie.2017.1377

Vinicius Woloszyn, Universidade Federal do Rio Grande do Sul (UFRGS)
Guilherme M. Machado, Universidade Federal do Rio Grande do Sul (UFRGS)
José Palazzo, Universidade Federal do Rio Grande do Sul (UFRGS)
Horacio Saggion, Universitat Pompeu Fabra
Leandro Krug Wives, Universidade Federal do Rio Grande do Sul (UFRGS)

Abstract

Teachers increasingly employ different methods to enrich the learning of a subject in class, drive other assignments, and meet curriculum standards. One such method is the use of movies as an alternative educational experience to support class discussions. Websites such as TeachWithMovies are a valuable aid for creating lesson plans: on this website, each movie is described as a lesson plan targeting the learning of a subject. However, creating such a lesson plan, or even a simple educational description of a movie, can demand considerable work and time, since the text must address the educational aspects of the movie. In this work, we propose BEATnIk (Biased Educational Automatic Text summarIzation), an unsupervised algorithm that automatically generates movie summaries. The algorithm favors educational aspects of the text in order to produce an educationally biased summary. Our experiments show that the approach outperforms a baseline by a statistically significant margin in precision, recall, and F-score.

Keywords: movies, education, automatic summary
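
The abstract reports results in precision, recall, and F-score but does not say how these are computed. As a minimal sketch, assuming a unigram-overlap measure between a generated summary and a human-written reference (in the spirit of ROUGE-1, which may differ from the setup actually used), the three scores could be obtained as follows. The function name `overlap_scores` and the example sentences are hypothetical and serve only as illustration.

```python
from collections import Counter


def overlap_scores(candidate: str, reference: str) -> tuple[float, float, float]:
    """Unigram-overlap precision, recall, and F1 between two texts.

    Assumption: simple lowercase whitespace tokenization; a real
    evaluation would likely use a proper tokenizer and stemming.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())

    # Clipped overlap: a word counts at most as often as it occurs in both texts.
    overlap = sum((cand & ref).values())

    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical generated summary versus a hypothetical reference description.
p, r, f = overlap_scores(
    "the film shows the civil war",
    "the film depicts the american civil war",
)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Precision penalizes words in the generated summary that the reference lacks, recall penalizes reference words the summary misses, and the F-score is their harmonic mean.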
