HyCodePolicy: Hybrid Language Controllers for Multimodal Monitoring and Decision in Embodied Agents

cs.AI updates on arXiv.org, August 5, 19:10

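This paper proposes HyCodePolicy, a hybrid language-based control framework that integrates code generation, geometric grounding, perceptual monitoring, and iterative repair. It significantly improves the robustness and sample efficiency of robot manipulation policies and offers a scalable strategy for applying multimodal reasoning in autonomous decision-making pipelines.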

arXiv:2508.02629v1 Announce Type: cross

Abstract: Recent advances in multimodal large language models (MLLMs) have enabled richer perceptual grounding for code policy generation in embodied agents. However, most existing systems lack effective mechanisms to adaptively monitor policy execution and repair code during task completion. In this work, we introduce HyCodePolicy, a hybrid language-based control framework that systematically integrates code synthesis, geometric grounding, perceptual monitoring, and iterative repair into a closed-loop programming cycle for embodied agents. Technically, given a natural language instruction, our system first decomposes it into subgoals and generates an initial executable program grounded in object-centric geometric primitives. The program is then executed in simulation, while a vision-language model (VLM) observes selected checkpoints to detect and localize execution failures. By fusing structured execution traces, which capture program-level events, with VLM-based perceptual feedback, HyCodePolicy infers failure causes and repairs programs. This hybrid dual feedback mechanism enables self-correcting program synthesis with minimal human supervision. Our results demonstrate that HyCodePolicy significantly improves the robustness and sample efficiency of robot manipulation policies, offering a scalable strategy for integrating multimodal reasoning into autonomous decision-making pipelines.
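
The closed-loop cycle described in the abstract (execute, monitor at checkpoints, fuse the two feedback sources, repair) can be illustrated with a toy sketch. This is not the authors' implementation: every name below (`TraceEvent`, `Diagnosis`, `run_program`, `fuse`, `closed_loop`, and the stub callbacks) is hypothetical, the program is a list of strings rather than real robot code, and the VLM and the repair model are replaced by hand-written stubs. The rule of blaming the earliest flagged step is also an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set, Tuple


@dataclass
class TraceEvent:
    step: int   # index of the step in the program
    name: str   # textual form of the step, e.g. "grasp(cup)"
    ok: bool    # program-level status reported by the executor


@dataclass
class Diagnosis:
    failed_step: Optional[int]  # None means no failure was detected
    reason: str


def run_program(program: List[str],
                execute_step: Callable[[str], bool]) -> List[TraceEvent]:
    """Run steps in order, record a trace, stop at the first program-level error."""
    trace: List[TraceEvent] = []
    for i, name in enumerate(program):
        ok = execute_step(name)
        trace.append(TraceEvent(i, name, ok))
        if not ok:
            break
    return trace


def fuse(trace: List[TraceEvent],
         reports: Dict[int, Optional[str]]) -> Diagnosis:
    """Merge both feedback sources and blame the earliest step either one flags."""
    candidates: List[Tuple[int, str]] = []
    for ev in trace:
        if not ev.ok:
            candidates.append((ev.step, f"executor reported failure in {ev.name}"))
    for step, reason in reports.items():
        if reason is not None:
            candidates.append((step, reason))
    if not candidates:
        return Diagnosis(None, "no failure detected")
    step, reason = min(candidates, key=lambda c: c[0])
    return Diagnosis(step, reason)


def closed_loop(program: List[str],
                execute_step: Callable[[str], bool],
                observe: Callable[[str], Optional[str]],
                repair: Callable[[List[str], Diagnosis], List[str]],
                checkpoints: Set[int],
                max_rounds: int = 3) -> Tuple[Optional[List[str]], int]:
    """Execute, monitor, diagnose, and repair until success or the round budget ends."""
    for round_idx in range(max_rounds):
        trace = run_program(program, execute_step)
        # The perceptual monitor only looks at the selected checkpoint steps.
        reports = {ev.step: observe(ev.name) for ev in trace if ev.step in checkpoints}
        diagnosis = fuse(trace, reports)
        if diagnosis.failed_step is None:
            return program, round_idx
        program = repair(program, diagnosis)
    return None, max_rounds


# --- Toy stubs standing in for the simulator, the VLM, and the repair model ---
def execute_step(name: str) -> bool:
    return True  # the executor raises no program-level error in this toy run


def observe(name: str) -> Optional[str]:
    # Stub "VLM": flags the plain grasp as a perceptual failure, accepts all else.
    return "gripper closed beside the cup" if name == "grasp(cup)" else None


def repair(program: List[str], diagnosis: Diagnosis) -> List[str]:
    fixed = list(program)  # copy so the original program is left untouched
    fixed[diagnosis.failed_step] = "grasp(cup, approach='top')"
    return fixed


initial = ["move_to(cup)", "grasp(cup)", "place(cup, shelf)"]
final_program, repairs = closed_loop(initial, execute_step, observe, repair,
                                     checkpoints={1, 2})
print(final_program, repairs)
```

In this toy run the executor reports no error, so the failure is visible only through the perceptual channel. That is the case the dual feedback design is meant to catch: a step that completes at the program level but fails in the scene.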
