[Paper] Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech
https://arxiv.org/abs/2407.12229
People change their tones of voice, often accompanied by nonverbal vocalizations (NVs) such as laughter and cries, to convey rich emotions. However, most text-to-speech (TTS) systems lack the capability to generate speech with rich emotions, including NVs..
Lab Study
2025. 3. 4.