JWB-DH-V1: Benchmark for Joint Whole-Body Talking Avatar and Speech Generation Version 1

cs.AI updates on arXiv.org, July 29, 12:21

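This paper introduces a new multi-modal dataset and evaluation protocol for assessing the joint generation of whole-body motion and synchronized speech. It reveals a performance gap in current methods between face/hand regions and the whole body, pointing to directions for future research.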

arXiv:2507.20987v1 Announce Type: cross Abstract: Recent advances in diffusion-based video generation have enabled photo-realistic short clips, but current methods still struggle to achieve multi-modal consistency when jointly generating whole-body motion and natural speech. Current approaches lack comprehensive evaluation frameworks that assess both visual and audio quality, and there are insufficient benchmarks for region-specific performance analysis. To address these gaps, we introduce the Joint Whole-Body Talking Avatar and Speech Generation Version 1 (JWB-DH-V1), comprising a large-scale multi-modal dataset with 10,000 unique identities across 2 million video samples, and an evaluation protocol for assessing joint audio-video generation of whole-body animatable avatars. Our evaluation of SOTA models reveals consistent performance disparities between face/hand-centric and whole-body performance, which indicates essential areas for future research. The dataset and evaluation tools are publicly available at https://github.com/deepreasonings/WholeBodyBenchmark.


