An Initial Study of Bird's-Eye View Generation for Autonomous Vehicles using Cross-View Transformers

cs.AI updates on arXiv.org, 13 hours ago

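This paper explores using Cross-View Transformers (CVT) to map camera images to three BEV map channels: road, lane markings, and planned trajectory. The study trains the CVT in a simulator and examines its generalization to unseen towns, the effect of different camera layouts, and two loss functions (focal and L1). The results indicate that CVT shows great promise for generating reasonably accurate BEV maps.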

arXiv:2508.12520v1 Announce Type: cross Abstract: Bird's-Eye View (BEV) maps provide a structured, top-down abstraction that is crucial for autonomous-driving perception. In this work, we employ Cross-View Transformers (CVT) to learn a mapping from camera images to three BEV channels (road, lane markings, and planned trajectory) using a realistic simulator for urban driving. Our study examines generalization to unseen towns, the effect of different camera layouts, and two loss formulations (focal and L1). Using training data from only one town, a four-camera CVT trained with the L1 loss delivers the most robust test performance when evaluated in a new town. Overall, our results underscore CVT's promise for mapping camera inputs to reasonably accurate BEV maps.
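
To make the two loss formulations concrete, here is a minimal NumPy sketch of an L1 loss and a binary focal loss applied to a three-channel BEV map. The abstract does not give the paper's exact formulations, so everything here is an assumption for illustration: the function names, the tiny 4x4 grid, and the `gamma` and `alpha` values, which are commonly used focal-loss defaults and not values taken from the paper.

```python
import numpy as np


def l1_loss(pred, target):
    """Mean absolute error over all channels and pixels."""
    return float(np.mean(np.abs(pred - target)))


def focal_loss(pred, target, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss averaged over all channels and pixels.

    gamma and alpha are assumed defaults, not values from the paper.
    """
    p = np.clip(pred, eps, 1.0 - eps)            # avoid log(0)
    p_t = np.where(target == 1, p, 1.0 - p)      # probability of the true class
    alpha_t = np.where(target == 1, alpha, 1.0 - alpha)
    # (1 - p_t)^gamma down-weights pixels that are already well classified
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))


# Hypothetical 3-channel target on a 4x4 grid:
# channel 0 = road, 1 = lane markings, 2 = planned trajectory.
target = np.zeros((3, 4, 4))
target[0, :, 1:3] = 1.0   # road covers the two middle columns
target[1, :, 2] = 1.0     # one lane-marking column
target[2, :, 1] = 1.0     # trajectory along one column

# An uninformative prediction: 0.5 everywhere.
pred = np.full((3, 4, 4), 0.5)

l1_value = l1_loss(pred, target)
focal_value = focal_loss(pred, target)
focal_perfect = focal_loss(target, target)
```

In this sketch the L1 loss treats every pixel error equally, while the focal loss scales each pixel's cross-entropy by how confidently wrong the prediction is, which is one common way to handle sparse classes such as lane markings.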
