Comparing Meta-Classifiers for Automatic Music Genre Classification

Vítor Shinohara; Juliano Foleiss; Tiago Tavares

doi:10.5753/sbcm.2019.10434

Vítor Shinohara, University of Campinas
Juliano Foleiss, Federal Technological University of Paraná
Tiago Tavares, State University of Campinas

Abstract

Automatic music genre classification is the problem of assigning mutually exclusive labels to audio tracks. It supports the organization of music collections and makes music easier to search for and to market. One approach is to compute several vector representations for each track and classify each of them individually. A majority voting system can then be used to infer a single label for the whole track. In this work, we evaluated the impact of replacing the majority voting system with a meta-classifier. Classification results with the meta-classifier showed statistically significant improvements over the majority-voting classifier. This indicates that the higher-level information used by the meta-classifier might be relevant for automatic music genre classification.
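The contrast between the two aggregation strategies can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which features the meta-classifier receives or which model it uses. The sketch assumes, purely for illustration, that the meta-classifier sees a normalized histogram of the segment-level predictions and that it is a toy nearest-centroid model. The names majority_vote, vote_histogram and NearestCentroidMeta are hypothetical, and the toy data is made up.

```python
from collections import Counter


def majority_vote(segment_labels):
    """Return the most frequent segment-level label for one track."""
    return Counter(segment_labels).most_common(1)[0][0]


def vote_histogram(segment_labels, genres):
    """Fraction of a track's segments predicted as each genre."""
    counts = Counter(segment_labels)
    total = len(segment_labels)
    return [counts[g] / total for g in genres]


class NearestCentroidMeta:
    """Toy meta-classifier (an assumption, not the paper's model).

    It stores the mean vote histogram of each genre seen in training and
    labels a new track with the genre whose mean histogram is closest.
    """

    def fit(self, histograms, track_labels):
        n_per_genre = Counter(track_labels)
        sums = {}
        for hist, label in zip(histograms, track_labels):
            acc = sums.setdefault(label, [0.0] * len(hist))
            for i, value in enumerate(hist):
                acc[i] += value
        self.centroids = {
            label: [v / n_per_genre[label] for v in acc]
            for label, acc in sums.items()
        }
        return self

    def predict(self, histogram):
        def squared_distance(centroid):
            return sum((a - b) ** 2 for a, b in zip(histogram, centroid))

        return min(self.centroids,
                   key=lambda label: squared_distance(self.centroids[label]))


# Made-up training data: the segment classifier is assumed to be biased,
# so jazz tracks still receive many "rock" segment predictions.
GENRES = ["rock", "jazz"]
train_histograms = [[0.9, 0.1], [1.0, 0.0], [0.6, 0.4], [0.5, 0.5]]
train_labels = ["rock", "rock", "jazz", "jazz"]
meta = NearestCentroidMeta().fit(train_histograms, train_labels)

# A new track whose segments were labeled mostly "rock".
segments = ["rock", "rock", "rock", "jazz", "jazz"]
voted = majority_vote(segments)
stacked = meta.predict(vote_histogram(segments, GENRES))
```

In this toy setting the majority vote follows the biased segment predictions, while the meta-classifier can learn that a 60/40 split is typical of the other genre. This is one way to read the "higher-level information" mentioned in the abstract, under the assumptions stated above.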

Keywords: Music Information Retrieval
