On the Use of Synopsis-based Features for Film Genre Classification
Abstract
Technological advances and the interest of companies operating in digital environments have made the categorization of media products increasingly popular. This is often a multi-label scenario, in which an item may be assigned several categories. Most of the literature approaches film genre classification as a single-label task, usually relying on audio-visual features. In this paper, we explore the use of text-based features extracted from film synopses for multi-label film genre classification. We experimented with 19 feature extraction approaches combined with 4 multi-label classifiers. Our experimental results show F1-scores of up to 54.8%, which are significantly higher than those reported in similar studies in the literature.