arXiv:2507.16850v1 Announce Type: cross Abstract: Monocular 3D human pose estimation remains a challenging and ill-posed problem, particularly in real-time settings and unconstrained environments. While direct image-to-3D approaches require large annotated datasets and heavy models, 2D-to-3D lifting offers a more lightweight and flexible alternative, especially when enhanced with prior knowledge. In this work, we propose a framework that combines real-time 2D keypoint detection with geometry-aware 2D-to-3D lifting, explicitly leveraging known camera intrinsics and subject-specific anatomical priors. Our approach builds on recent advances in self-calibration and biomechanically constrained inverse kinematics to generate large-scale, plausible 2D-3D training pairs from MoCap and synthetic datasets. We discuss how these ingredients can enable fast, personalized, and accurate 3D pose estimation from monocular images without requiring specialized hardware. This proposal aims to foster discussion on bridging data-driven learning and model-based priors to improve the accuracy, interpretability, and deployability of 3D human motion capture on edge devices in the wild.
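
To make the geometry-aware lifting idea concrete, the following is a minimal sketch (not code from the paper) of how known camera intrinsics constrain 2D-to-3D lifting: pixel keypoints are back-projected through the inverse intrinsics matrix into camera-frame rays, per-joint depths (e.g., from a lifting network, here simply given as input) place points along those rays, and a subject-specific bone-length prior rescales a limb segment. The function name `backproject`, the intrinsics values, and the 0.3 m bone length are all hypothetical.

```python
import numpy as np

def backproject(kpts_2d, depths, K):
    """Lift 2D pixel keypoints to 3D camera-frame points.

    kpts_2d: (J, 2) pixel coordinates
    depths:  (J,) per-joint depths along each camera ray (assumed given,
             e.g., predicted by a lifting network)
    K:       (3, 3) camera intrinsics
    """
    J = kpts_2d.shape[0]
    homog = np.concatenate([kpts_2d, np.ones((J, 1))], axis=1)  # (J, 3) homogeneous pixels
    rays = (np.linalg.inv(K) @ homog.T).T                       # rays with unit z-depth
    return rays * depths[:, None]                               # (J, 3) 3D points

# Toy example with assumed intrinsics: two keypoints at depth 2 m.
K = np.array([[1000.0,    0.0, 640.0],
              [   0.0, 1000.0, 360.0],
              [   0.0,    0.0,   1.0]])
kpts = np.array([[640.0, 360.0],
                 [740.0, 360.0]])
pts3d = backproject(kpts, np.array([2.0, 2.0]), K)

# Subject-specific anatomical prior: rescale the second joint so the
# segment matches a known limb length (a hypothetical 0.3 m here).
bone = pts3d[1] - pts3d[0]
pts3d[1] = pts3d[0] + bone * (0.3 / np.linalg.norm(bone))
print(pts3d)
```

In a full pipeline along the lines the abstract describes, such hard geometric constraints would complement the learned lifter rather than replace it, which is one way the proposal's model-based priors can improve interpretability over purely data-driven regression.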