Trending
Articles related to "vision-language"
VolDoGer: LLM-assisted Datasets for Domain Generalization in Vision-Language Tasks
cs.AI updates on arXiv.org 2025-07-25T04:28:47.000000Z
2025.07.23 | TIM model breaks through LLM context limits; Step-Audio 2 improves multimodal spoken dialogue.
HuggingFace Daily AI Paper Digest 2025-07-24T00:07:02.000000Z
ICCV 2025 perfect-score paper: one model unifies spatial understanding and active exploration
机器之心 2025-07-14T06:44:59.000000Z
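Scientists reveal a hierarchical structure of information flow inside models, which could improve the transparency of multimodal AI systems
MIT Technology Review - This Week's Top Stories 2025-06-29T16:12:35.000000Z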
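Humanoid robot bridges the gap between visual perception and motion for the first time; a Chinese PhD at UC Berkeley gives a live demo on the Unitree G1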
量子位 2025-06-25T06:45:44.000000Z
2025.06.03 | High-entropy tokens improve LLM reasoning; Reasoning Gym optimizes reinforcement learning environments.
HuggingFace Daily AI Paper Digest 2025-06-03T23:12:55.000000Z
[Call for Papers & Challenge] ACM MM 2025: First Workshop & Challenge on "Vision-Language for Soft Robots"
我爱计算机视觉 2025-06-03T13:52:22.000000Z
Advancing Vision-Language Reward Models: Challenges, Benchmarks, and the Role of Process-Supervised Learning
MarkTechPost@AI 2025-04-03T07:25:28.000000Z
Alibaba open-sources its latest multimodal model Qwen2.5-VL-32B: focused on vision-language, with strong mathematical reasoning too
IT之家 2025-03-25T01:28:12.000000Z
This AI Paper from UC Berkeley Introduces TULIP: A Unified Contrastive Learning Model for High-Fidelity Vision and Language Understanding
MarkTechPost@AI 2025-03-24T05:05:26.000000Z
TNNLS24 | Dynamic networks! A single model taking different paths can generate different image captions!
我爱计算机视觉 2024-11-14T12:11:04.000000Z
SAM, CLIP... What recent multimodal and other research is based on RWKV? [Part 2]
RWKV元始智能 2024-10-28T00:09:59.000000Z
EMOVA: A Novel Omni-Modal LLM for Seamless Integration of Vision, Language, and Speech
MarkTechPost@AI 2024-10-05T17:35:52.000000Z
MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models (LMMs) for Integrated Capabilities
MarkTechPost@AI 2024-08-09T18:04:49.000000Z
Is landscape framing a rebellion against the vertical-screen era?
36kr 2024-07-12T12:33:45.000000Z
Microsoft Releases Florence-2: A Novel Vision Foundation Model with a Unified, Prompt-based Representation for a Variety of Computer Vision and Vision-Language Tasks
MarkTechPost@AI 2024-06-21T17:31:47.000000Z
Learning Visiolinguistic Representations with ViLBERT w/ Stefan Lee - #358
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) 2024-05-12T03:32:27.000000Z