MarkTechPost@AI — October 20, 2024
Differentiable Rendering of Robots (Dr. Robot): A Robot Self-Model Differentiable from Its Visual Appearance to Its Control Parameters

This article introduces the link between visual and action data in robotics, the problems that link poses, and Dr. Robot, a method proposed by researchers from Columbia University and Stanford University. The method combines several techniques to enable differentiable robot control, allowing robots to learn actions from visual foundation models. It performs strongly across multiple experiments and lays a new foundation for vision-based learning in robot control tasks.

🎈Dr. Robot is a differentiable robot rendering method that integrates Gaussian splatting, implicit linear blend skinning, and pose-conditioned appearance deformation. It computes gradients from robot images and propagates them to action control parameters, making it compatible with a wide range of robot morphologies and degrees of freedom.

💪The method's core components are Gaussian splatting, which models the robot's appearance and geometry in a canonical pose, and implicit LBS, which adapts that model to different robot poses. The robot's appearance is represented by a set of 3D Gaussians that are transformed and deformed according to the robot's pose.

🌟Dr. Robot outperforms existing techniques in robot pose reconstruction, recovering robot poses from video with higher accuracy and beating prior methods by over 30% in joint-angle estimation. It is also demonstrated in downstream applications such as robot action planning.

Visual and action data are interconnected in robotic tasks, forming a perception-action loop. Robots rely on control parameters for movement, while visual foundation models (VFMs) excel at processing visual data. However, a modality gap separates visual and action data, arising from fundamental differences in sensory modality, abstraction level, temporal dynamics, contextual dependence, and susceptibility to noise. These differences make it hard to relate visual perception directly to action control, requiring intermediate representations or learning algorithms to bridge the gap. Today, robots are typically represented by geometric primitives such as triangle meshes, with kinematic structures describing their morphology. While VFMs provide generalizable control signals, passing those signals to robots has remained challenging.

Researchers from Columbia University and Stanford University proposed “Dr. Robot,” a differentiable robot rendering method that integrates Gaussian splatting, implicit linear blend skinning (LBS), and pose-conditioned appearance deformation to enable differentiable robot control. The key innovation is the ability to compute gradients from robot images and transfer them to action control parameters, making the method compatible with various robot forms and degrees of freedom. This allows robots to learn actions from VFMs, closing the gap between visual inputs and control actions, which was previously hard to achieve.
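To make the core idea concrete, here is a minimal sketch of what "gradients from robot images flow to control parameters" means. It is not the paper's implementation: the renderer is reduced to the 2D end-effector position of a toy two-link arm, the image loss to a pixel-distance stand-in, and the gradient is taken by finite differences, whereas Dr. Robot obtains analytic gradients through its differentiable Gaussian-splatting renderer. All function names here are illustrative.

```python
import numpy as np

def fk(thetas, link_lens=(1.0, 1.0)):
    """Forward kinematics of a planar 2-link arm: joint angles -> end-effector (x, y)."""
    t1, t2 = thetas
    l1, l2 = link_lens
    return np.array([l1 * np.cos(t1) + l2 * np.cos(t1 + t2),
                     l1 * np.sin(t1) + l2 * np.sin(t1 + t2)])

def image_loss(thetas, target_px):
    """Stand-in for a rendering loss: squared distance between the 'rendered'
    end-effector position and a target observed in an image."""
    return np.sum((fk(thetas) - target_px) ** 2)

def grad_fd(f, x, eps=1e-5):
    """Finite-difference gradient of a scalar loss w.r.t. the joint angles."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        d = np.zeros_like(x)
        d[i] = eps
        g[i] = (f(x + d) - f(x - d)) / (2 * eps)
    return g

target = fk(np.array([0.7, -0.4]))   # pose we want the robot to reach
thetas = np.array([0.0, 0.5])        # initial joint-angle guess
for _ in range(200):                 # gradient descent through the "renderer"
    thetas = thetas - 0.1 * grad_fd(lambda t: image_loss(t, target), thetas)
```

Because the whole pipeline from joint angles to the image-space loss is differentiable, a visual objective alone is enough to drive the control parameters toward the observed pose.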

The core components of Dr. Robot are Gaussian splatting, which models the robot’s appearance and geometry in a canonical pose, and implicit LBS, which adapts this model to different robot poses. The robot’s appearance is represented by a set of 3D Gaussians that are transformed and deformed according to the robot’s pose. A differentiable forward kinematics model tracks these changes, while a deformation function adapts the robot’s appearance in real time. The method produces high-quality gradients for learning robotic control from visual data, as demonstrated by outperforming the state of the art in robot pose reconstruction and by planning robot actions through VFMs. In evaluation experiments, Dr. Robot reconstructs robot poses from video more accurately than existing methods, exceeding them by over 30% in joint-angle estimation. The framework is also demonstrated in applications such as robot action planning from language prompts and motion retargeting.
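The linear blend skinning step above can be sketched in a few lines. This is a hand-rolled 2D illustration under simplifying assumptions: the skinning weights are fixed by hand, whereas in Dr. Robot they come from a learned implicit function evaluated at each Gaussian's canonical position, and only the Gaussian centers are posed (the full method also transforms covariances and deforms appearance conditioned on pose).

```python
import numpy as np

def rot2d(a):
    """2D rotation matrix for angle a (radians)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s], [s, c]])

def lbs(points, weights, transforms):
    """Linear blend skinning: each canonical point is moved by a
    weight-blended mix of per-bone rigid transforms (R, t)."""
    posed = np.zeros_like(points)
    for b, (R, t) in enumerate(transforms):
        posed += weights[:, b:b + 1] * (points @ R.T + t)
    return posed

# Two canonical Gaussian centers and two bones.
centers = np.array([[0.5, 0.0],    # sits on bone 0
                    [1.5, 0.0]])   # mostly follows bone 1
weights = np.array([[1.0, 0.0],
                    [0.2, 0.8]])
transforms = [
    (rot2d(0.0), np.zeros(2)),                       # bone 0: identity
    (rot2d(np.pi / 2), np.array([1.0, 0.0])),        # bone 1: bent 90 degrees
]

posed = lbs(centers, weights, transforms)
```

Every operation here is a matrix product or weighted sum, so gradients with respect to the bone transforms (and hence the joint angles that generate them via forward kinematics) pass straight through.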

In conclusion, the research presents a robust approach to controlling robots with visual foundation models by developing a fully differentiable robot representation. Dr. Robot serves as a bridge between the visual world and the robot’s action space, enabling planning and control directly from images and pixels. By integrating forward kinematics, Gaussian splatting, and implicit LBS into an efficient and flexible method, the paper sets a new foundation for vision-based learning in robotic control tasks.


Check out the Paper and Project. All credit for this research goes to the researchers of this project.


