Is it possible to describe television series episodes from online comments?
Abstract
Owing to the ubiquitous use of the Internet and Web 2.0 in today's society, it is easy to find groups or communities of people discussing the most varied subjects. In this paper, we try to answer whether, even when nothing is explicitly known about the entity under discussion, it is possible to form a general, summarized idea of its characteristics by reading comments about it. To study this problem, we analyze the potential of comments to describe television series, and we perform a comment classification task that identifies the series and episode with which each comment is associated. This classification task serves as the basis of a method that selects comments of high descriptive value for episodes and series. Results show that a small set of comments can describe their episodes and, when taken together, the series as a whole.
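The idea of using a classifier to pick out descriptive comments can be illustrated with a minimal sketch. The abstract does not say which classifier or selection rule is used, so everything below is an assumption made for illustration: a multinomial naive Bayes model with add-one smoothing stands in for the classifier, a comment counts as "descriptive" when the model assigns it to its own episode with high posterior probability, and the names `train`, `posterior`, `select_descriptive`, and the toy labels are hypothetical. For brevity the sketch scores the same comments it was trained on, which a real experiment would avoid.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and keep alphabetic runs only (a deliberately crude tokenizer).
    return re.findall(r"[a-z]+", text.lower())


def train(comments):
    """Count words per label from a list of (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in comments:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def posterior(text, model):
    """Return P(label | text) under multinomial naive Bayes with add-one smoothing."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    log_scores = {}
    for label, n_docs in label_counts.items():
        n_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][tok] + 1) / (n_words + len(vocab)))
        log_scores[label] = score
    # Normalize in log space to avoid underflow.
    top = max(log_scores.values())
    unnorm = {label: math.exp(s - top) for label, s in log_scores.items()}
    z = sum(unnorm.values())
    return {label: v / z for label, v in unnorm.items()}


def select_descriptive(comments, model, k=1):
    """Keep, per label, the k correctly classified comments with the highest posterior."""
    scored = defaultdict(list)
    for text, label in comments:
        probs = posterior(text, model)
        predicted = max(probs, key=probs.get)
        if predicted == label:
            scored[label].append((probs[label], text))
    return {
        label: [text for _, text in sorted(items, reverse=True)[:k]]
        for label, items in scored.items()
    }


# Toy data: labels are hypothetical "series/episode" identifiers.
comments = [
    ("the dragon burned the fleet", "show_a/ep1"),
    ("great episode", "show_a/ep1"),
    ("the detective found the missing letter", "show_b/ep1"),
    ("great episode", "show_b/ep1"),
]
model = train(comments)
selected = select_descriptive(comments, model, k=1)
```

In this toy run, generic comments such as "great episode" carry little evidence for either episode, so the plot-specific comments are the ones retained for each label.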