[Paper] Imaginary Voice: Face-Styled Diffusion Model for Text-to-Speech
https://arxiv.org/abs/2302.13700
The goal of this work is zero-shot text-to-speech synthesis, with speaking styles and voices learnt from facial characteristics. Inspired by the natural fact that people can imagine the voice of someone when they look at his or her face, we introduce a fac
This post was written after reading the paper linked above.
Abstract: In this paper..
Lab Study
2024. 4. 22.