Transformer-based Spatial Grounding: A Comprehensive Survey

cs.AI updates on arXiv.org, July 18, 12:13

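This paper presents a systematic literature review of Transformer-based spatial grounding methods from 2018 to 2025. It analyzes the dominant model architectures, commonly used datasets, and evaluation metrics, and summarizes key methods and best practices, offering researchers and practitioners a reference and guidance.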

arXiv:2507.12739v1 Announce Type: cross

Abstract: Spatial grounding, the process of associating natural language expressions with corresponding image regions, has advanced rapidly with the introduction of transformer-based models, which have significantly enhanced multimodal representation and cross-modal alignment. Despite this progress, the field lacks a comprehensive synthesis of current methodologies, dataset usage, evaluation metrics, and industrial applicability. This paper presents a systematic literature review of transformer-based spatial grounding approaches from 2018 to 2025. Our analysis identifies dominant model architectures, prevalent datasets, and widely adopted evaluation metrics, and highlights key methodological trends and best practices. This study provides essential insights and structured guidance for researchers and practitioners, facilitating the development of robust, reliable, and industry-ready transformer-based spatial grounding models.

Tags: Transformer, spatial grounding, model architectures, datasets, evaluation metrics
