An interplay between genre and emotion prediction in music: a study on the Emotify dataset
Abstract
Automatic classification problems are common in the music information retrieval domain. Among them, the automatic identification of music genre and music mood are frequently addressed. Genre and mood labels are both generated by humans according to subjective experiences shaped by each individual's growth and development; that is, each person attributes different meanings to genre and mood labels. However, because both genre and mood arise from a similar process related to an individual's social surroundings, we hypothesize that they are somehow related. In this study, we present experiments on the Emotify dataset, which comprises audio data and genre- and mood-related tags for several pieces. We show that genre can be predicted from audio data with high accuracy; however, we consistently obtained low accuracy when predicting mood tags. Additionally, we attempted to predict genre from mood tags and also obtained low accuracy. An analysis of the feature space reveals that our features are more related to genre than to mood, which explains the results from a linear algebra viewpoint. However, we still lack a music-related explanation for this difference.
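As an illustration of the kind of comparison described above (a minimal sketch, not the authors' pipeline), the Python snippet below trains the same linear classifier on a shared set of per-track audio features and compares cross-validated accuracy for genre labels versus mood labels. The feature matrix, the scikit-learn pipeline, and the random placeholder data are assumptions made here for illustration; a real run would use features extracted from the Emotify audio and labels taken from its annotations (four genres and nine GEMS-based mood categories).

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Placeholder data standing in for per-track audio descriptors and tags;
# in practice the features would be computed from the Emotify audio excerpts.
X = rng.normal(size=(400, 40))           # e.g., summarized spectral/MFCC features per track
y_genre = rng.integers(0, 4, size=400)   # Emotify spans four genres
y_mood = rng.integers(0, 9, size=400)    # nine GEMS-based mood categories

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Same features, same classifier: only the label set changes.
genre_acc = cross_val_score(clf, X, y_genre, cv=5).mean()
mood_acc = cross_val_score(clf, X, y_mood, cv=5).mean()
print(f"genre accuracy: {genre_acc:.2f} | mood accuracy: {mood_acc:.2f}")

With real features, a gap between the two scores would mirror the observation above that the feature space aligns more closely with genre than with mood.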