Trending
Articles related to "vision-language models"
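IBM Research and collaborators build the largest remote-sensing instruction dataset to date and propose a VLM designed for Earth observation data; accepted to CVPR 2025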
智源社区 2025-04-28T08:37:56.000000Z
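7B model surpasses GPT! With 1/20 of the data and no knowledge distillation, University of Maryland and collaborators introduce a new visual reasoning method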
智源社区 2025-04-25T07:07:52.000000Z
Microsoft Research Introduces MMInference to Accelerate Pre-filling for Long-Context Vision-Language Models
MarkTechPost@AI 2025-04-25T06:30:37.000000Z
Kimi open-sources an MoE multimodal reasoning model: few activated parameters, big capability!
魔搭ModelScope社区 2025-04-19T06:12:52.000000Z
Meta AI Released the Perception Language Model (PLM): An Open and Reproducible Vision-Language Model to Tackle Challenging Visual Recognition Tasks
MarkTechPost@AI 2025-04-19T00:35:33.000000Z
Do We Still Need Complex Vision-Language Pipelines? Researchers from ByteDance and WHU Introduce Pixel-SAIL—A Single Transformer Model for Pixel-Level Understanding That Outperforms 7B MLLMs
MarkTechPost@AI 2025-04-17T17:15:33.000000Z
CVPR 2025 | MAE loss plus optimal transport, a powerful combination! ShanghaiTech University proposes a new robust prompt learning method
PaperWeekly 2025-04-16T13:17:44.000000Z
USTC and ZTE propose a new post-training paradigm: a small multimodal model successfully reproduces R1-style reasoning
机器之心 2025-04-14T08:36:03.000000Z
Moonshot AI Released Kimi-VL: A Compact and Powerful Vision-Language Model Series Redefining Multimodal Reasoning, Long-Context Understanding, and High-Resolution Visual Processing
MarkTechPost@AI 2025-04-12T03:45:32.000000Z
Kimi-VL: A new exploration of vision-language models (VLMs)
月之暗面 Kimi 2025-04-11T16:15:03.000000Z
Kimi 16B beats GPT-4o! Open-source visual reasoning model: MoE architecture, only 2.8B parameters activated at inference
智源社区 2025-04-11T08:50:56.000000Z
UCLA Researchers Released OpenVLThinker-7B: A Reinforcement Learning Driven Model for Enhancing Complex Visual Reasoning and Step-by-Step Problem Solving in Multimodal Systems
MarkTechPost@AI 2025-03-29T04:50:43.000000Z
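Catching AI that "lies about what it sees": Google and Columbia University use three types of traps to trigger hallucinations, building an evaluation framework that evolves dynamically as the technology advances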
36kr-科技 2025-03-28T13:02:13.000000Z
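32B, deployable locally! Alibaba open-sources its latest multimodal model: focused on vision-language, and also strong at mathematical reasoning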
智源社区 2025-03-26T15:13:51.000000Z
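32B, deployable locally! Alibaba open-sources its latest multimodal model: focused on vision-language, and also strong at mathematical reasoning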
量子位 2025-03-25T12:11:48.000000Z
Qwen Releases the Qwen2.5-VL-32B-Instruct: A 32B Parameter VLM that Surpasses Qwen2.5-VL-72B and Other Models like GPT-4o Mini
MarkTechPost@AI 2025-03-25T05:47:29.000000Z
This AI Paper from NVIDIA Introduces Cosmos-Reason1: A Multimodal Model for Physical Common Sense and Embodied Reasoning
MarkTechPost@AI 2025-03-25T04:05:28.000000Z
2025.03.24 | Multi-agent collaboration improves performance; Socratic dialogue optimizes prompts.
HuggingFace 每日AI论文速递 2025-03-24T23:02:36.000000Z
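50 samples unlock spatial intelligence: MetaSpatial, an RL framework for 3D spatial reasoning in vision-language models | Northwestern University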
智源社区 2025-03-23T08:24:12.000000Z
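50 samples unlock spatial intelligence: MetaSpatial, an RL framework for 3D spatial reasoning in vision-language models | Northwestern University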
量子位 2025-03-22T10:19:39.000000Z