[Paper] MM-TTS: A Unified Framework for Multimodal, Prompt-Induced Emotional Text-to-Speech Synthesis
https://arxiv.org/abs/2404.18398
Emotional Text-to-Speech (E-TTS) synthesis has gained significant attention in recent years due to its potential to enhance human-computer interaction. However, current E-TTS approaches often struggle to capture the complexity of human emotions, primarily ..
This paper ..
Lab Study
2024. 6. 17.