cs.AI updates on arXiv.org, 6 hours ago
SpectroStream: A Versatile Neural Codec for General Audio

 

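This paper proposes SpectroStream, a full-band multi-channel neural audio codec that achieves high-quality reconstruction of 48 kHz stereo music at bit rates of 4-16 kbps. It improves audio quality through a time-frequency-domain audio representation and a delayed-fusion strategy.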

arXiv:2508.05207v1 Announce Type: cross Abstract: We propose SpectroStream, a full-band multi-channel neural audio codec. A successor to the well-established SoundStream, SpectroStream extends its capability beyond 24 kHz monophonic audio and enables high-quality reconstruction of 48 kHz stereo music at bit rates of 4-16 kbps. This is accomplished with a new neural architecture that leverages audio representation in the time-frequency domain, which leads to better audio quality, especially at higher sample rates. The model also uses a delayed-fusion strategy to handle multi-channel audio, which is crucial in balancing per-channel acoustic quality and cross-channel phase consistency.
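To put the quoted bit rates in perspective, the short sketch below compares them with uncompressed PCM audio. It assumes a 16-bit PCM baseline, which the abstract does not state, and the function names are illustrative only, not part of any SpectroStream code.

```python
def pcm_bitrate_kbps(sample_rate_hz, channels, bits_per_sample=16):
    """Bit rate of uncompressed PCM audio in kbps (1 kbps = 1000 bit/s)."""
    return sample_rate_hz * channels * bits_per_sample / 1000


def compression_ratio(codec_kbps, sample_rate_hz=48_000, channels=2,
                      bits_per_sample=16):
    """Ratio of the PCM bit rate to the codec bit rate.

    The 16-bit baseline is an assumption made for illustration.
    """
    return pcm_bitrate_kbps(sample_rate_hz, channels, bits_per_sample) / codec_kbps


if __name__ == "__main__":
    baseline = pcm_bitrate_kbps(48_000, 2)
    print(f"48 kHz stereo 16-bit PCM: {baseline:.0f} kbps")
    for kbps in (4, 16):
        print(f"{kbps} kbps codec: about {compression_ratio(kbps):.0f}x smaller")
```

Under that assumption, 48 kHz stereo PCM runs at 1536 kbps, so the 4-16 kbps range corresponds to roughly 384x to 96x fewer bits than the uncompressed signal.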


Related tags

SpectroStream, neural audio codec, audio quality, multi-channel audio