Adaptive Knowledge Distillation for Device-Directed Speech Detection

cs.AI updates on arXiv.org · 5 hours ago

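This paper proposes an adaptive knowledge distillation method to improve the accuracy of device-directed speech detection (DDSD). It applies task-specific adapters on top of a pre-trained acoustic encoder, keeps deployment efficient, and significantly improves DDSD performance for both keyword and keyword-free (follow-up) invocations.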
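To make the idea concrete, here is a minimal sketch of one way a joint distillation objective for a binary task like DDSD could be written. This is not the paper's loss, which is not given on this page. The function names, the temperature, the weight `alpha`, and the choice of a softened KL term are all assumptions for illustration. `adapted_teacher_logit` stands for the output of a hypothetical adapter and classification head placed on top of the pre-trained teacher encoder.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _clamp(p: float, eps: float = 1e-7) -> float:
    """Keep probabilities away from 0 and 1 so logs stay finite."""
    return min(max(p, eps), 1.0 - eps)


def bce(logit: float, label: int) -> float:
    """Binary cross-entropy of one logit against a 0/1 label
    (1 = device-directed, 0 = background or side speech)."""
    p = _clamp(sigmoid(logit))
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def soft_kd(student_logit: float, teacher_logit: float,
            temperature: float = 2.0) -> float:
    """KL(teacher || student) between temperature-softened Bernoulli
    outputs. The temperature value is an assumption."""
    pt = _clamp(sigmoid(teacher_logit / temperature))
    ps = _clamp(sigmoid(student_logit / temperature))
    return (pt * math.log(pt / ps)
            + (1.0 - pt) * math.log((1.0 - pt) / (1.0 - ps)))


def joint_loss(student_logit: float, adapted_teacher_logit: float,
               label: int, alpha: float = 0.5,
               temperature: float = 2.0) -> float:
    """Assumed joint objective: task loss for the student, task loss for
    the adapter branch, plus a weighted distillation term. In a real
    training loop only the adapter and student parameters would receive
    gradients; the teacher encoder itself would stay untouched."""
    task_student = bce(student_logit, label)
    task_adapter = bce(adapted_teacher_logit, label)
    distill = soft_kd(student_logit, adapted_teacher_logit, temperature)
    return task_student + task_adapter + alpha * distill


if __name__ == "__main__":
    # Hypothetical logits for one device-directed utterance.
    print(joint_loss(student_logit=0.3, adapted_teacher_logit=2.1, label=1))
```

The distillation term is zero when the student and the adapted teacher agree and grows as their softened predictions diverge, so the student is pulled toward the task-adapted teacher rather than toward the teacher's general-purpose representation.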

arXiv:2508.02801v1 Announce Type: cross

Abstract: Device-directed speech detection (DDSD) is a binary classification task that separates the user's queries to a voice assistant (VA) from background speech or side conversations. This is important for achieving naturalistic user experience. To this end, we propose knowledge distillation (KD) to enhance DDSD accuracy while ensuring efficient deployment. Specifically, we introduce a novel adaptive KD method that transfers knowledge from general representations of an ASR large pre-trained acoustic encoder (teacher). We apply task-specific adapters, on top of the (frozen) teacher encoder, trained jointly with the student model on DDSD. We demonstrate that the proposed adaptive KD outperforms the student model without distillation in the keyword and keyword-free (follow-up) invocations, with an improvement of +26% and +19% in terms of Equal Error Rate, respectively. We also show that this approach generalizes across the transformer and conformer-based model architectures.
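The results above are reported in terms of Equal Error Rate (EER), the operating point at which the false-accept rate equals the false-reject rate. The following is a minimal sketch of how EER can be estimated from classifier scores by sweeping thresholds. The function names and the example numbers are hypothetical, and reading the reported gains as relative EER reductions is an assumption, since the abstract does not say how the percentages are computed.

```python
def equal_error_rate(scores, labels):
    """Estimate EER by sweeping a decision threshold over the scores.

    labels: 1 = device-directed speech, 0 = everything else.
    An utterance is accepted when its score is >= the threshold.
    Returns (eer, threshold) at the point where the false-accept and
    false-reject rates are closest to each other.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")

    # Every observed score is a candidate threshold; +inf rejects all.
    candidates = sorted(set(scores)) + [float("inf")]
    best = None
    for t in candidates:
        far = sum(s >= t for s in neg) / len(neg)  # false-accept rate
        frr = sum(s < t for s in pos) / len(pos)   # false-reject rate
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]


def relative_eer_reduction(baseline_eer, new_eer):
    """Fractional EER reduction of a new system relative to a baseline."""
    return (baseline_eer - new_eer) / baseline_eer


if __name__ == "__main__":
    # Hypothetical scores, not data from the paper.
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    labels = [1, 1, 1, 0, 1, 0, 0, 0]
    print(equal_error_rate(scores, labels))
    print(relative_eer_reduction(0.10, 0.074))
```

With finite test sets the two error rates rarely cross exactly, so this sketch averages them at the closest threshold; interpolating between neighboring thresholds is a common alternative.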

Related tags

Knowledge distillation · Speech recognition · Device-directed speech detection
