State of art of real-time singing voice synthesis

Leonardo Brum; Edward David Moreno

doi:10.5753/sbcm.2019.10422

Leonardo Brum, Federal University of Sergipe
Edward David Moreno, Federal University of Sergipe

Abstract

This paper describes the state of the art of real-time singing voice synthesis and presents its concept, applications, and technical aspects. A technological mapping and a literature review are carried out to identify the latest developments in the area, followed by a brief comparative analysis of the selected works. Finally, we discuss challenges and future research problems.

Keywords: Real-time singing voice synthesis, Sound Synthesis, TTS, MIDI, Computer Music.

Keywords: Music Analysis and Synthesis, Real-time Interactive Systems, Software Systems and Languages for Sound and Music
