Articles related to "Visual Question Answering"
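Factual accuracy evaluation of multimodal large models: o1 is the strongest, models are generally overconfident, and they perform best on modern architecture, engineering technology, and science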
智源社区
2025-02-24T07:37:16.000000Z
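Factual accuracy evaluation of multimodal large models: o1 is the strongest, models are generally overconfident, and they perform best on modern architecture, engineering technology, and science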
量子位
2025-02-24T01:13:50.000000Z
Advancing Large Multimodal Models: DocHaystack, InfoHaystack, and the Vision-Centric Retrieval-Augmented Generation Framework
MarkTechPost@AI
2024-12-07T01:34:51.000000Z
Peking University, Tsinghua University, and others propose LLaVA-o1: the o1 of vision-language models is here!
PaperWeekly
2024-11-23T11:41:42.000000Z
Fine-tune multimodal models for vision and text use cases on Amazon SageMaker JumpStart
AWS Machine Learning Blog
2024-11-15T17:03:17.000000Z
Generalized Visual Language Models
Lil'Log
2024-11-09T05:43:41.000000Z
ChatGPT self-study guide: a roundup of must-have reference books
智源社区
2024-07-30T07:07:01.000000Z
Visual Haystacks Benchmark: The First “Visual-Centric” Needle-In-A-Haystack (NIAH) Benchmark to Assess LMMs’ Capability in Long-Context Visual Retrieval and Reasoning
MarkTechPost@AI
2024-07-24T07:19:20.000000Z
Google DeepMind Unveils PaliGemma: A Versatile 3B Vision-Language Model VLM with Large-Scale Ambitions
MarkTechPost@AI
2024-07-12T11:16:31.000000Z
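Multimodal large models can answer incorrectly even when they understand the image: BAAI and several partner institutions release a robustness benchmark for multimodal models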
PaperAgent
2024-07-04T14:06:28.000000Z
Robust Visual Reasoning with Adriana Kovashka - #463
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
2024-05-12T03:02:26.000000Z