MarkTechPost@AI, February 13
Meta AI Introduces PARTNR: A Research Framework Supporting Seamless Human-Robot Collaboration in Multi-Agent Tasks

 

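Meta AI has introduced PARTNR, a large-scale benchmark designed to evaluate human-robot collaboration in simulated environments. The benchmark comprises 100,000 natural language tasks spanning 60 simulated homes and 5,819 unique objects, and specifically evaluates tasks with spatial, temporal, and heterogeneous constraints. A semi-automated pipeline that combines LLMs with simulation-in-the-loop validation keeps task generation realistic and scalable. Experiments show that current LLM-based planning agents have significant limitations in coordination, task tracking, and error recovery: compared with human teams, they need more steps to complete tasks and succeed less often. PARTNR aims to set a standard for evaluating how effectively AI collaborates with human partners and to drive innovation in collaborative embodied AI systems.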

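🏠 The PARTNR benchmark comprises 100,000 natural language tasks spanning 60 simulated homes and 5,819 unique objects. It specifically evaluates tasks with spatial, temporal, and heterogeneous constraints, with the goal of modeling the complexity of real-world human-robot collaboration.

🤖 Tasks in PARTNR fall into four categories: constraint-free tasks (flexible execution order), spatial tasks (requiring specific object positioning), temporal tasks (requiring ordered execution), and heterogeneous tasks (involving actions beyond the robot's capability, which require human intervention).

📈 Evaluations of LLM-based planning agents on PARTNR show that, under non-privileged conditions, state-of-the-art LLMs reach a success rate of only 30%, compared with 93% when humans complete the tasks on their own. This highlights the gap in current AI models' ability to collaborate with humans.

⚙️ The study shows that a fine-tuned smaller LLM can match the performance of a model nine times larger while running inference 8.6 times faster. This suggests that model efficiency must be weighed against scale in human-robot collaboration.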

Human-robot collaboration focuses on developing intelligent systems that work alongside humans in dynamic environments. Researchers aim to build robots capable of understanding and executing natural language instructions while adapting to constraints such as spatial positioning, task sequencing, and capability-sharing between humans and machines. Progress in this field advances robotics for household assistance, healthcare, and industrial automation, where efficiency and adaptability are crucial for seamless integration.

A major challenge in human-robot collaboration is the lack of a comprehensive benchmark to evaluate planning and reasoning abilities in multi-agent tasks. While previous benchmarks have addressed navigation and single-agent interactions, they fail to capture real-world complexities where robots must coordinate with humans. Many existing approaches do not account for real-time task tracking, partner adaptation, or effective error recovery. Without an established standard, it is difficult to systematically assess and improve collaborative AI performance in interactive settings.

Current approaches in embodied AI often focus on single-agent task execution, disregarding the necessity of coordination in multi-agent scenarios. Some methods rely on templated task instructions, limiting scalability and task diversity, while others depend on manually crafted evaluation functions, making large-scale assessments impractical. Despite advancements, state-of-the-art large language models (LLMs) struggle with task tracking, coordination, and recovery from execution failures. These limitations hinder their ability to function efficiently in human-centric environments where adaptability and precise task execution are essential.

Researchers at FAIR Meta have introduced PARTNR (Planning And Reasoning Tasks in humaN-Robot collaboration), a large-scale benchmark designed to assess human-robot coordination in simulated environments. PARTNR comprises 100,000 natural language tasks, spanning 60 simulated homes and 5,819 unique objects. The benchmark specifically evaluates tasks incorporating spatial, temporal, and heterogeneous constraints. Researchers ensured a realistic and scalable task generation process by leveraging a semi-automated pipeline integrating LLMs and simulation-in-the-loop validation. PARTNR aims to set a standard for evaluating AI’s ability to collaborate with human partners effectively.

To create the benchmark, the researchers used LLMs to generate task instructions and evaluation functions. These were then filtered through simulation to remove infeasible tasks. The final dataset underwent human-in-the-loop validation to enhance task diversity and ensure accuracy. The tasks in PARTNR fall into four categories: constraint-free, spatial, temporal, and heterogeneous. Constraint-free tasks allow flexibility in execution order, while spatial tasks require specific object positioning. Temporal tasks necessitate ordered execution, and heterogeneous tasks involve actions beyond the robot’s capability, requiring human intervention. These task structures introduce challenges in coordination, tracking, and execution accuracy.
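To make the four task categories concrete, here is a minimal Python sketch of how a task and a simple evaluation function might be represented. It is an illustration only, not PARTNR's actual data format or evaluation code: the `Task` fields, the `evaluate` function, and the step names are all hypothetical. Spatial constraints are left out because checking them would require simulator state.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskType(Enum):
    CONSTRAINT_FREE = "constraint_free"
    SPATIAL = "spatial"
    TEMPORAL = "temporal"
    HETEROGENEOUS = "heterogeneous"


@dataclass
class Task:
    # Hypothetical task record; field names are assumptions, not PARTNR's schema.
    instruction: str
    task_type: TaskType
    required_steps: list[str]
    # Temporal constraints: (a, b) means step a must happen before step b.
    before: list[tuple[str, str]] = field(default_factory=list)
    # Heterogeneous constraints: steps that only the human can perform.
    human_only: set[str] = field(default_factory=set)


def evaluate(task: Task, log: list[tuple[str, str]]) -> bool:
    """Check an execution log of (agent, step) pairs against the task."""
    done = [step for _, step in log]
    # Every required step must appear in the log.
    if any(step not in done for step in task.required_steps):
        return False
    # Ordering constraints must hold (first occurrence of each step).
    for a, b in task.before:
        if a not in done or b not in done or done.index(a) > done.index(b):
            return False
    # Human-only steps must not be executed by the robot.
    return all(agent == "human" for agent, step in log if step in task.human_only)


temporal = Task(
    instruction="Clear the table, then wipe it.",
    task_type=TaskType.TEMPORAL,
    required_steps=["clear_table", "wipe_table"],
    before=[("clear_table", "wipe_table")],
)
print(evaluate(temporal, [("robot", "clear_table"), ("human", "wipe_table")]))  # True
print(evaluate(temporal, [("human", "wipe_table"), ("robot", "clear_table")]))  # False
```

A constraint-free task in this sketch is simply one with empty `before` and `human_only` fields, so any order and any assignment of steps to agents passes.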

Evaluations of LLM-based planning agents on PARTNR revealed significant limitations in coordination, task tracking, and error recovery. When paired with humans, LLM-guided robots required 1.5 times as many steps as human-human teams and 1.1 times as many steps as a single human to complete tasks. The success rate of state-of-the-art LLMs was only 30% under non-privileged conditions, compared to 93% when tasks were performed solely by humans. Moreover, fine-tuned smaller LLMs achieved performance comparable to models nine times larger while being 8.6 times faster at inference. In decentralized multi-agent settings, task completion required 1.3 times as many steps as in a single-agent scenario, demonstrating inefficiencies in current coordination mechanisms.
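The reported multipliers are easier to compare when placed on a common scale. The short Python sketch below reads each multiplier as a ratio of step counts and works out what they imply relative to one another. The baseline of 100 steps for a single human is a made-up number chosen only for illustration, and the calculation assumes the two ratios were measured on comparable tasks.

```python
# Figures as reported in the article.
LLM_TEAM_VS_HUMAN_TEAM = 1.5  # human + LLM-guided robot vs. human-human team
LLM_TEAM_VS_SOLO_HUMAN = 1.1  # human + LLM-guided robot vs. a single human
LLM_SUCCESS_PCT = 30          # state-of-the-art LLMs, non-privileged conditions
HUMAN_SUCCESS_PCT = 93        # tasks performed solely by humans


def implied_steps(solo_human_steps: float) -> dict[str, float]:
    """Scale the reported ratios from a hypothetical single-human step count."""
    human_llm_team = LLM_TEAM_VS_SOLO_HUMAN * solo_human_steps
    human_human_team = human_llm_team / LLM_TEAM_VS_HUMAN_TEAM
    return {
        "solo_human": round(solo_human_steps, 1),
        "human_llm_team": round(human_llm_team, 1),
        "human_human_team": round(human_human_team, 1),
    }


steps = implied_steps(100)  # 100 is an illustrative baseline, not a reported value
success_gap_points = HUMAN_SUCCESS_PCT - LLM_SUCCESS_PCT

print(steps)
print(f"Success-rate gap: {success_gap_points} percentage points")
```

Under these assumptions, a human-human team would need roughly 73% of the steps a single human needs (1.1 / 1.5), while a human paired with an LLM-guided robot needs about 10% more than the single human. In other words, adding the robot costs steps instead of saving them.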

PARTNR highlights crucial gaps in existing AI-driven human-robot collaboration models, emphasizing the need for better planning, tracking, and decision-making strategies. The findings indicate that, despite advancements in AI, collaborative agents still need substantial improvement to close the performance gap with humans. The structured evaluation framework offered by PARTNR provides a pathway for advancing AI’s ability to collaborate, plan, and execute tasks efficiently. Future research should focus on refining LLM-based planners, improving coordination mechanisms, and enhancing perception models to address current limitations in multi-agent interaction. PARTNR is a valuable resource for driving innovation in collaborative embodied AI systems.


Check out the Paper. All credit for this research goes to the researchers of this project.


The post Meta AI Introduces PARTNR: A Research Framework Supporting Seamless Human-Robot Collaboration in Multi-Agent Tasks appeared first on MarkTechPost.
