[Paper] Improving Medical Speech-to-Text Accuracy with Vision-Language Pre-training Model
https://arxiv.org/abs/2303.00091
This post is based on the paper linked above.
Lab Study
2024. 4. 16.