Automatic onset detection using convolutional neural networks
An important task in music research is estimating the instants at which meaningful events begin (onsets) and end (offsets). Onset detection is applied in many fields, including electrocardiograms, seismographic data, and stock market results, and in many Music Information Research (MIR) tasks such as Automatic Music Transcription, Rhythm Detection, and Speech Recognition. Automatic Onset Detection (AOD) has recently benefited greatly from Artificial Intelligence (AI) methods, mainly Machine Learning and Deep Learning. In this work, we explore the use of Convolutional Neural Networks (CNNs), adapting their original architecture to automatic onset detection in musical audio signals. We applied a CNN to onset detection on a very general dataset that is well acknowledged by the MIR community, and assessed the accuracy of the method by comparing its output with the ground-truth annotations published with the dataset. The results are promising and outperform other methods of musical onset detection.
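As an illustration of what comparing detections against ground-truth onsets can involve, the following is a minimal sketch rather than the evaluation procedure used in this work. It assumes that a detected onset counts as correct when it lies within a fixed tolerance of a not-yet-matched annotated onset. The function name `evaluate_onsets` and the 50 ms default tolerance are assumptions chosen for illustration.

```python
from typing import Sequence, Tuple


def evaluate_onsets(
    detected: Sequence[float],
    reference: Sequence[float],
    tolerance: float = 0.05,  # assumed tolerance in seconds (hypothetical default)
) -> Tuple[float, float, float]:
    """Return (precision, recall, f_measure) for onset times given in seconds."""
    det = sorted(detected)
    ref = sorted(reference)
    i = j = 0
    true_positives = 0
    # Walk both sorted lists in parallel; each annotation is matched at most once.
    while i < len(det) and j < len(ref):
        diff = det[i] - ref[j]
        if abs(diff) <= tolerance:
            true_positives += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # detection is too early for this annotation: false positive
        else:
            j += 1  # annotation has no detection close enough: false negative
    precision = true_positives / len(det) if det else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Made-up onset times, for illustration only.
    p, r, f = evaluate_onsets([0.10, 0.52, 1.00, 1.80], [0.12, 0.50, 1.50])
    print(f"precision={p:.3f} recall={r:.3f} f_measure={f:.3f}")
```

The greedy two-pointer matching keeps the sketch short. A stricter evaluation might compute an optimal one-to-one matching between detections and annotations instead.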