Jeongwooyeol

All Posts

[Paper] StarGAN-VC++: Towards Emotion Preserving Voice Conversion Using Deep Embeddings
https://arxiv.org/abs/2309.07592
Voice conversion (VC) transforms an utterance to sound like another person without changing the linguistic content. A recently proposed generative adversarial network-based VC method, StarGANv2-VC is very successful in generating natural-sounding conversio..
I wrote this post after reading the paper. Abs..
Lab Study | 2025. 4. 4.

[Paper] FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
https://arxiv.org/abs/2006.04558
Non-autoregressive text to speech (TTS) models such as FastSpeech can synthesize speech significantly faster than previous autoregressive models with comparable quality. The training of FastSpeech model relies on an autoregressive teacher model for duratio..
I wrote this post after reading the paper. Abstract: Like FastSpeech ..
Lab Study | 2025. 4. 1.

[Paper] Revealing Emotional Clusters in Speaker Embeddings: A Contrastive Learning Strategy for Speech Emotion Recognition
https://arxiv.org/html/2401.11017v1
License: CC BY 4.0. arXiv:2401.11017v1 [eess.AS] 19 Jan 2024
Abstract: Speaker embeddings carry valuable emotion-related info..
Lab Study | 2025. 3. 31.

[Paper] VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion
https://arxiv.org/abs/2106.10132
One-shot voice conversion (VC), which performs conversion across arbitrary speakers with only a single target-speaker utterance for reference, can be effectively achieved by speech representation disentanglement. Existing work generally ..
Lab Study | 2025. 3. 26.

[Paper] Speech Emotion Diarization: Which Emotion Appears When?
https://arxiv.org/abs/2306.12991
Speech Emotion Recognition (SER) typically relies on utterance-level solutions. However, emotions conveyed through speech should be considered as discrete speech events with definite temporal boundaries, rather than attributes of the entire utterance. To r..
I wrote this post after reading the paper. Abstract: speech emotion reco..
Lab Study | 2025. 3. 12.

[Paper] Expressive-VC: Highly Expressive Voice Conversion with Attention Fusion of Bottleneck and Perturbation Features
https://arxiv.org/abs/2211.04710
Voice conversion for highly expressive speech is challenging. Current approaches struggle with the balancing between speaker similarity, intelligibility and expressiveness. To address this problem, we propose Expressive-VC, a novel end-to-end voice conve..
Lab Study | 2025. 3. 10.

[Paper] Expressive Voice Conversion: A Joint Framework for Speaker Identity and Emotional Style Transfer
https://arxiv.org/abs/2107.03748
Traditional voice conversion (VC) has been focused on speaker identity conversion for speech with a neutral expression. We note that emotional expression plays an essential role in daily communication, and the emotional style of speech can be speaker-depend..
This ..
Lab Study | 2025. 3. 7.

[Paper] Improved Text Emotion Prediction Using Combined Valence and Arousal Ordinal Classification
https://arxiv.org/abs/2404.01805
Emotion detection in textual data has received growing interest in recent years, as it is pivotal for developing empathetic human-computer interaction systems. This paper introduces a method for categorizing emotions from text, which acknowledges and diffe..
After reading this paper ..
Lab Study | 2025. 3. 5.
