2025/04

[Paper] StarGAN-VC++: Towards Emotion Preserving Voice Conversion Using Deep Embeddings
https://arxiv.org/abs/2309.07592
Voice conversion (VC) transforms an utterance to sound like another person without changing the linguistic content. A recently proposed generative adversarial network-based VC method, StarGANv2-VC is very successful in generating natural-sounding conversio.. (arxiv.org)
These notes were written after reading the paper. Abs..
Lab Study, 2025. 4. 4.

[Paper] FastSpeech2: Fast and High-Quality End-to-End Text to Speech
https://arxiv.org/abs/2006.04558
Non-autoregressive text to speech (TTS) models such as FastSpeech can synthesize speech significantly faster than previous autoregressive models with comparable quality. The training of FastSpeech model relies on an autoregressive teacher model for duratio.. (arxiv.org)
These notes were written after reading the paper. Abstract: Like FastSpeech..
Lab Study, 2025. 4. 1.