State of art of real-time singing voice synthesis
Abstract
This paper describes the state of the art of real-time singing voice synthesis and presents its concepts, applications, and technical aspects. A technological mapping and a literature review are carried out to identify the latest developments in this area. We present a brief comparative analysis of the selected works. Finally, we discuss challenges and future research problems.
Keywords: Real-time singing voice synthesis, Sound Synthesis, TTS, MIDI, Computer Music.