arXiv:2507.12496v1 Announce Type: cross Abstract: Foundation Models (FMs) and World Models (WMs) offer complementary strengths for task generalization at different levels. In this work, we propose FOUNDER, a framework that integrates the generalizable knowledge embedded in FMs with the dynamic modeling capabilities of WMs to enable open-ended, reward-free task solving in embodied environments. We learn a mapping function that grounds FM representations in the WM state space, effectively inferring the agent's physical states in the world simulator from external observations. This mapping enables learning a goal-conditioned policy through imagination during behavior learning, with the mapped task serving as the goal state. Our method uses the predicted temporal distance to the goal state as an informative reward signal. FOUNDER demonstrates superior performance on various multi-task offline visual control benchmarks, excelling at capturing the deep-level semantics of tasks specified by text or videos, particularly in scenarios with complex observations or domain gaps where prior methods struggle. The consistency of our learned reward function with the ground-truth reward is also empirically validated. Our project website is https://sites.google.com/view/founder-rl.
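The abstract describes two learned components: a mapping from FM representations into the WM latent state space (yielding a goal state), and a temporal-distance predictor whose negated output serves as a dense reward during imagination. The sketch below illustrates one way these pieces could be wired up in PyTorch; it is a minimal illustration assuming simple MLP heads, and all names (`FMToWMMapper`, `TemporalDistanceReward`, `goal_reward`) and dimensions are hypothetical, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class FMToWMMapper(nn.Module):
    """Hypothetical mapping network: grounds a foundation-model embedding
    of a task specification (text or video) in the world model's latent
    state space, so the mapped task can serve as a goal state."""

    def __init__(self, fm_dim: int, wm_state_dim: int, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(fm_dim, hidden),
            nn.ELU(),
            nn.Linear(hidden, wm_state_dim),
        )

    def forward(self, fm_embedding: torch.Tensor) -> torch.Tensor:
        # Maps an FM embedding to a latent WM state (the goal).
        return self.net(fm_embedding)


class TemporalDistanceReward(nn.Module):
    """Hypothetical temporal-distance head: predicts how many steps
    separate a latent state from the goal state; its negation is used
    as a dense reward during imagination rollouts."""

    def __init__(self, wm_state_dim: int, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * wm_state_dim, hidden),
            nn.ELU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor, goal: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, goal], dim=-1)).squeeze(-1)


def goal_reward(distance_head: TemporalDistanceReward,
                states: torch.Tensor, goal: torch.Tensor) -> torch.Tensor:
    """Reward for a batch of imagined states: states predicted to be
    temporally closer to the goal receive higher reward."""
    with torch.no_grad():
        return -distance_head(states, goal.expand_as(states))


if __name__ == "__main__":
    fm_dim, wm_dim, batch = 768, 256, 16      # placeholder dimensions
    mapper = FMToWMMapper(fm_dim, wm_dim)
    dist_head = TemporalDistanceReward(wm_dim)

    task_embedding = torch.randn(fm_dim)      # e.g. an FM text/video embedding
    goal_state = mapper(task_embedding)       # grounded goal in WM state space
    imagined = torch.randn(batch, wm_dim)     # latents from imagination rollouts
    rewards = goal_reward(dist_head, imagined, goal_state)
    print(rewards.shape)                      # torch.Size([16])
```

In a full pipeline, both heads would presumably be trained on offline trajectories (the distance head from observed step gaps between states) before the rewards are used to optimize the goal-conditioned policy in imagination; those training details are not specified in the abstract.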