MarkTechPost@AI, December 4, 2024
Are LLMs Ready for Real-World Path Planning? A Critical Evaluation

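This article examines the path-planning abilities of LLMs in vehicle navigation systems. The study finds that, although some research suggests LLMs hold advantages, they have many problems in practice, such as poor adaptation to new environments and complex scenarios. In the experiments, several LLMs made errors across different tasks, indicating that they are not suited to real-world navigation and that purpose-built designs will be needed in the future.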

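🎯 LLMs in vehicle navigation systems and the path-planning abilities that need to be evaluated

💡 The limitations of traditional methods have drawn attention to LLMs, which have problems of their own

🔬 Researchers ran experiments to test how LLMs perform in real-world path planning

❌ The LLMs made errors across the different tasks in the experiments and are not reliable enough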

Large Language Models (LLMs) are advanced AI systems trained on large amounts of data to understand and generate human-like language. As LLMs are increasingly integrated into vehicle navigation systems, it is important to understand their path-planning capability. In early 2024, many car manufacturers integrated AI-powered voice assistants into their vehicles to handle infotainment control, navigation, climate management, and general knowledge questions. Whether these assistants can plan real-world routes is one capability that needs to be assessed before they are relied on for vehicle navigation.

Traditional methods struggle with memory and efficiency as maps grow, which has led to interest in using LLMs. Some studies suggest LLMs can generate waypoints or assist in tasks like vision-and-language navigation (VLN), where robots follow verbal instructions using visual cues. Some researchers believe that LLMs can outperform A* and other standard path-planning algorithms because they can produce more flexible, creative solutions. However, LLMs usually do not adapt well to new environments or highly complex scenarios without extensive fine-tuning. Additionally, most studies of LLMs in path planning have been run in highly simplified simulation environments and do not necessarily reflect the challenges these models face in real applications.
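For readers unfamiliar with the baseline mentioned above, here is a minimal textbook sketch of A* on a toy grid. It is not taken from the study; the grid encoding (0 for free, 1 for blocked), the 4-connected moves, and the function name `a_star` are assumptions chosen for illustration. Real road networks are weighted graphs, not grids.

```python
import heapq

def a_star(grid, start, goal):
    """Return a shortest 4-connected path on a grid of 0 (free) / 1 (blocked) cells,
    as a list of (row, col) tuples, or None if the goal is unreachable."""
    def h(cell):
        # Manhattan distance never overestimates the cost of unit-cost grid moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_heap = [(h(start), 0, start)]   # entries are (f = g + h, g, cell)
    came_from = {}                       # cell -> predecessor on best known path
    best_g = {start: 0}                  # cell -> cheapest known cost from start

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if g > best_g[cell]:
            continue  # stale heap entry; a cheaper route to this cell was found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = new_g
                    came_from[(nr, nc)] = cell
                    heapq.heappush(open_heap, (new_g + h((nr, nc)), new_g, (nr, nc)))
    return None

# Toy map: the middle row is blocked except for its last cell.
toy_grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
print(a_star(toy_grid, (0, 0), (2, 0)))
```

The `best_g` dictionary and the open heap both grow with the number of explored cells, which illustrates the memory pressure on large maps that the paragraph above refers to.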

To address these gaps, researchers from Duke University and George Mason University tested three LLMs on six real-world path-planning scenarios, spanning various settings and difficulty levels, to determine their effectiveness in turn-by-turn and vision-and-language navigation.

The scenarios involved creating step-by-step directions to reach destinations, sometimes within time constraints. The study assessed the LLMs on two tasks: Turn-by-Turn (TbT) Navigation, which provides step-by-step directions in urban, suburban, and rural settings, and Vision-and-Language Navigation (VLN), which guides users with visual landmarks. The scenarios ranged in difficulty; some TbT prompts were time-specific, and Gemini required follow-up prompts to give detailed VLN guidance. Three LLMs (GPT-4, Gemini, and Mistral 7B) were tested across these tasks to assess their real-world path-planning capabilities.

The study evaluated the LLMs by comparing their routes to ground truth from Waze and identifying major and minor errors. Major errors included route discontinuities, incorrect directions, and missed exits, while minor errors were smaller misdirections. In TbT navigation, the LLMs often left gaps in the route or gave wrong directions. In VLN, the models struggled with missing segments, wrong landmarks, or failing to reach the destination. In the time-constrained tests, GPT-4 performed best, particularly in the urban and suburban cases. Mistral did best in urban navigation, GPT-4 in suburban and rural areas, and Gemini in VLN. Ultimately, none of the three models consistently produced an accurate route, which shows that they struggle with tasks that require spatial understanding.
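The article does not say how the errors were identified in practice, and the comparison may well have been done by hand. As a rough illustration of two of the error types named above (route discontinuities and failing to reach the destination), the following hypothetical checker treats a route as a list of (start, end) segments between named intersections. The function name `find_route_errors` and this route encoding are assumptions, not the authors' procedure.

```python
def find_route_errors(segments, origin, destination):
    """Flag structural problems in a route given as (start, end) intersection pairs.

    Returns a list of human-readable error strings; an empty list means the
    route is continuous and connects origin to destination. It does not judge
    whether each turn matches a reference route.
    """
    if not segments:
        return ["route is empty"]

    errors = []
    if segments[0][0] != origin:
        errors.append(f"route starts at {segments[0][0]}, not origin {origin}")

    # A discontinuity: one step ends somewhere other than where the next begins.
    for i in range(len(segments) - 1):
        if segments[i][1] != segments[i + 1][0]:
            errors.append(f"discontinuity between steps {i + 1} and {i + 2}")

    if segments[-1][1] != destination:
        errors.append(f"route ends at {segments[-1][1]}, not destination {destination}")
    return errors

# Hypothetical model output with a gap: nothing connects B to C.
gapped_route = [("A", "B"), ("C", "D")]
print(find_route_errors(gapped_route, "A", "D"))
```

A check like this only catches structural gaps. Judging incorrect directions, missed exits, or wrong landmarks would still require comparison against a reference route such as the Waze ground truth used in the study.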

In summary, this research showed that the tested LLMs are unfit for real-world navigation. GPT-4 performed slightly better in Turn-by-Turn (TbT) scenarios, while Gemini was better in Vision-and-Language Navigation (VLN), but all the models made errors. Therefore, these LLMs are unreliable for vehicle navigation, and car companies should be cautious about using them. In the future, this work can inform the design of LLMs built specifically for this task so that the technology can be integrated into vehicles and navigation systems.


Check out the Paper. All credit for this research goes to the researchers of this project.

The post Are LLMs Ready for Real-World Path Planning? A Critical Evaluation appeared first on MarkTechPost.
