Trending
Articles related to "Vision-Language Models"
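Fudan University team releases ParaCAD, the first benchmark dataset for CAD parametric understanding tasks to include dimension annotations! New paradigm PHT-CAD sets a new SOTA!
我爱计算机视觉 (I Love Computer Vision) 2025-04-05T12:52:01.000000Z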
Accuracy jumps 13.7%! Fudan releases a new benchmark for CAD parameterization; the PHT-CAD framework accurately parses engineering drawings
PaperWeekly 2025-04-04T13:07:14.000000Z
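Built to catch AI "lying about what it sees": Google and Columbia University use three types of traps to trigger hallucinations, creating an evaluation framework that evolves dynamically as the technology advances
智源社区 (BAAI Community) 2025-03-29T11:14:55.000000Z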
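Fudan University team releases ParaCAD, the first benchmark dataset for CAD parametric understanding tasks to include dimension annotations! New paradigm PHT-CAD sets a new SOTA!
我爱计算机视觉 (I Love Computer Vision) 2025-03-27T14:11:50.000000Z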
CoSyn: An AI Framework that Leverages the Coding Capabilities of Text-only Large Language Models (LLMs) to Automatically Create Synthetic Text-Rich Multimodal Data
MarkTechPost@AI 2025-02-26T04:48:35.000000Z
Convergence Releases Proxy Lite: A Mini, Open-Weights Version of Proxy Assistant Performing Pretty Well on UI Navigation Tasks
MarkTechPost@AI 2025-02-25T20:10:40.000000Z
Google DeepMind Research Releases SigLIP2: A Family of New Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
MarkTechPost@AI 2025-02-22T05:50:11.000000Z
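Simply shrinking the position-encoding interval of visual tokens lets multimodal large models understand a million tokens with ease! A new breakthrough from Tsinghua University, the University of Hong Kong, and Shanghai AI Lab
机器之心 (Synced) 2025-01-15T05:47:52.000000Z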
MMLongBench-Doc: A Comprehensive Benchmark for Evaluating Long-Context Document Understanding in Large Vision-Language Models
MarkTechPost@AI 2024-07-19T11:48:45.000000Z