Towards Robust Speech Recognition for Jamaican Patois Music Transcription

cs.AI updates on arXiv.org, July 24, 13:31

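To address the poor performance of speech recognition systems on Jamaican Patois music, researchers manually transcribed a large amount of Patois music data, used it to fine-tune ASR models, and proposed performance scaling laws for Whisper models, with the aim of improving the accessibility of Patois music and advancing Patois language modeling.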

arXiv:2507.16834v1 Announce Type: cross Abstract: Although Jamaican Patois is a widely spoken language, current speech recognition systems perform poorly on Patois music, producing inaccurate captions that limit accessibility and hinder downstream applications. In this work, we take a data-centric approach to this problem by curating more than 40 hours of manually transcribed Patois music. We use this dataset to fine-tune state-of-the-art automatic speech recognition (ASR) models, and use the results to develop scaling laws for the performance of Whisper models on Jamaican Patois audio. We hope that this work will have a positive impact on the accessibility of Jamaican Patois music and the future of Jamaican Patois language modeling.
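
The abstract does not state which functional form the scaling laws take. As a minimal sketch, assuming a simple power law between hours of training audio and word error rate (WER), such a curve could be fitted by linear regression in log-log space. The function name `fit_power_law` and the data points are hypothetical; the values are synthetic placeholders generated from a known curve, not results from the paper.

```python
import numpy as np


def fit_power_law(hours, wer):
    """Fit wer ~= a * hours**(-b) by least squares in log-log space.

    Assumption: a plain power law with no irreducible-error term.
    The paper's actual scaling-law form is not given in the abstract.
    """
    slope, intercept = np.polyfit(np.log(hours), np.log(wer), deg=1)
    return float(np.exp(intercept)), float(-slope)


# Synthetic placeholder data drawn from a known curve (a=0.9, b=0.25).
# These are NOT measurements from the paper.
hours = np.array([1.0, 2.0, 5.0, 10.0, 20.0, 40.0])
wer = 0.9 * hours ** -0.25

a, b = fit_power_law(hours, wer)
print(f"fitted a={a:.3f}, b={b:.3f}")
```

With real measurements, the fitted exponent `b` would indicate how quickly WER falls as more transcribed audio is added, under the stated power-law assumption.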
