Instrumental Sensibility of Vocal Detector Based on Spectral Features
Detecting the singing voice in a mixture of sound sources remains a challenging task in music information retrieval (MIR) research. The same musical content can be perceived in many different ways as the instrumentation varies. We evaluate how instrumentation affects singing voice detection in musical pieces using a standard spectral feature, Mel-frequency cepstral coefficients (MFCC). We trained Random Forest models on remixes of songs that contain only specific subsets of the sound sources, and compared them with models trained on the original songs. We present a preliminary analysis of the resulting classification accuracy.