Application of Vision-Language Model to Pedestrians Behavior and Scene Understanding in Autonomous Driving

cs.AI updates on arXiv.org, July 31, 12:48

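This paper proposes a knowledge distillation method that transfers knowledge from large-scale vision-language models to efficient vision networks. The method is applied to pedestrian behavior prediction and scene understanding, and it significantly improves perception and decision-making in autonomous driving.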

arXiv:2501.06680v2. Announce type: replace-cross.

Abstract: Vision-language models (VLMs) have become a promising approach to enhancing perception and decision-making in autonomous driving. A gap remains, however, in applying VLMs to complex scenarios that involve interaction with pedestrians and in deploying them efficiently on vehicles. In this paper, we propose a knowledge distillation method that transfers knowledge from large-scale vision-language foundation models to efficient vision networks. We apply it to pedestrian behavior prediction and scene understanding tasks, achieving promising results in generating more diverse and comprehensive semantic attributes. We also use multiple pre-trained models and ensemble techniques to boost the model's performance. We further examine the effectiveness of the model after knowledge distillation; the results show significant metric improvements in open-vocabulary perception and trajectory prediction tasks, which can potentially enhance the end-to-end performance of autonomous driving.
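The abstract does not state which distillation objective or ensemble scheme the authors use. As a rough illustration only, the sketch below shows the generic soft-label form of knowledge distillation: a student is trained to match a teacher's temperature-softened class distribution, with the teacher signal optionally averaged over several models. All function names, the temperature value, and the toy logits are assumptions made for this example, not details from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_logits(logit_sets):
    """Average per-class logits from several teachers (a simple ensemble)."""
    count = len(logit_sets)
    return [sum(column) / count for column in zip(*logit_sets)]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    This is the common soft-label formulation, assumed here for
    illustration; the paper's actual loss may differ.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical logits over three semantic attributes from two teachers.
teacher = ensemble_logits([[2.0, 0.5, -1.0], [1.5, 1.0, -0.5]])
student = [0.2, 0.1, 0.0]
loss = distillation_loss(teacher, student)
print(f"distillation loss: {loss:.4f}")
```

In a real training loop this term would typically be computed on batched tensors in a deep learning framework and combined with a task loss; the pure-Python version above only makes the arithmetic explicit.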

Related tags

Knowledge distillation, vision-language models, autonomous driving, pedestrian behavior prediction
