Training a convolutional neural network for note onset detection on the clarinet

  • Tairone N. Magalhães, Universidade Federal de Minas Gerais
  • Mauricio A. Loureiro, Universidade Federal de Minas Gerais


Although computational models for note onset detection have improved drastically in the last decade, mainly due to advances in deep learning, they are still far from perfect. On specific data, such as clarinet recordings, they still produce a significant number of false positives and false negatives. In this paper, we evaluate pre-trained onset detection models from the madmom library on a dataset of solo clarinet recordings to investigate their performance on this kind of data. We also use the clarinet dataset to train the same convolutional neural network (CNN) employed in one of those models, to investigate whether training on this specific data improves performance on clarinet recordings. The model trained strictly on clarinet data yields considerably better results than the models trained on generic data.
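
To make the notions of false positives and false negatives concrete, the following is a minimal Python sketch of how detected onsets might be scored against annotated ones. It is not the evaluation code used in the paper: the function name `evaluate_onsets`, the greedy matching strategy, and the 25 ms tolerance window are all assumptions chosen for illustration.

```python
def evaluate_onsets(detected, annotated, tolerance=0.025):
    """Score detected onset times against annotated ones (all in seconds).

    Hypothetical helper: a detection counts as a true positive when it lies
    within `tolerance` seconds of a not-yet-matched annotation. The 25 ms
    default is an assumed value, not one taken from the paper.
    """
    detected = sorted(detected)
    annotated = sorted(annotated)
    i = j = tp = 0
    # Walk both sorted lists in parallel, pairing each onset at most once.
    while i < len(detected) and j < len(annotated):
        diff = detected[i] - annotated[j]
        if abs(diff) <= tolerance:
            tp += 1          # detection matches this annotation
            i += 1
            j += 1
        elif diff < 0:
            i += 1           # detection precedes any usable annotation: false positive
        else:
            j += 1           # annotation has no detection nearby: false negative
    fp = len(detected) - tp
    fn = len(annotated) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f_measure": f_measure}


if __name__ == "__main__":
    # Made-up onset times, used only to exercise the function.
    print(evaluate_onsets([0.50, 1.02, 1.50, 2.40], [0.51, 1.00, 2.00, 2.41]))
```

In this toy input, the detection at 1.50 s has no annotation nearby (a false positive) and the annotation at 2.00 s has no detection nearby (a false negative). A greedy pass like this one is a simplification; a real evaluation would more likely rely on an established toolkit.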

Keywords: Artificial Intelligence, A-Life and Evolutionary Music Systems, Music Information Retrieval


MAGALHÃES, Tairone N.; LOUREIRO, Mauricio A. Training a convolutional neural network for note onset detection on the clarinet. In: SIMPÓSIO BRASILEIRO DE COMPUTAÇÃO MUSICAL (SBCM), 18., 2021, Recife. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 55-59. DOI: