Unsupervised Heterogeneous Graph Neural Network for Hit Song Prediction through One Class Learning

Angelo Cesar Mendes da Silva; Marcos Paulo Silva Gôlo; Ricardo Marcondes Marcacini

doi:10.5753/kdmile.2022.227954

Angelo Cesar Mendes da Silva, Universidade de São Paulo
Marcos Paulo Silva Gôlo, Universidade de São Paulo
Ricardo Marcondes Marcacini, Universidade de São Paulo

Abstract

Although success is a subjective concept, it can be related to popularity and user interest. Measuring the success of a song in advance provides information of great interest to the music market. Hit song prediction is an established task in Music Information Retrieval that explores approaches for measuring music success from song features. Musical data is intrinsically multimodal: features from different sources carry complementary semantic information. Structuring musical data and building a single space that embeds multiple features is therefore a challenge in musical data representation. Heterogeneous graphs offer a way to structure multimodal data and to explore the intrinsic semantic relationships between features. This work proposes structuring musical features as heterogeneous graphs and learning a new graph-based multimodal representation for songs with an unsupervised graph neural network, in order to address the hit song prediction task. We formulate hit song prediction as a one-class learning problem to mitigate gaps in non-hit song data and to treat hit songs as the class of interest. We evaluate representations based on lyrics and artist features and report promising results: our learned representations outperform other strategies for representing musical data.

Keywords: graph-based representation, heterogeneous graph, music representation, one class hit song prediction
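
The one-class formulation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which one-class algorithm is used, so the code below assumes a simple centroid-and-radius boundary, a simplified stand-in for hypersphere-style methods such as support vector data description. The function names (`fit_one_class`, `predict_hit`), the `quantile` parameter, and the toy 2-D vectors are all hypothetical; in the setting described above, the inputs would be the song representations learned by the graph neural network.

```python
import math
import random


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def fit_one_class(hit_embeddings, quantile=0.95):
    """Fit a hypersphere using examples of the interest class only.

    Returns (center, radius). The radius is chosen so that roughly
    `quantile` of the training hits fall inside the sphere.
    """
    center = centroid(hit_embeddings)
    dists = sorted(math.dist(v, center) for v in hit_embeddings)
    idx = min(len(dists) - 1, int(quantile * len(dists)))
    return center, dists[idx]


def predict_hit(embedding, center, radius):
    """True if the embedding lies inside the learned hypersphere."""
    return math.dist(embedding, center) <= radius


# Toy data standing in for learned song embeddings (assumed values,
# not taken from the paper). Only "hit" examples are used for fitting.
rng = random.Random(0)
hits = [[rng.gauss(1.0, 0.1), rng.gauss(1.0, 0.1)] for _ in range(200)]
center, radius = fit_one_class(hits)

near = predict_hit([1.0, 1.0], center, radius)   # close to the hit cluster
far = predict_hit([5.0, -3.0], center, radius)   # far from the hit cluster
```

The point of the design is that no non-hit examples are needed at training time: any song whose representation falls outside the boundary learned from hits is treated as a non-hit.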
