"
大型多模态模型
" 相关文章
Unveiling Effective In-Context Configurations for Image Captioning: An External & Internal Analysis
cs.AI updates on arXiv.org
2025-07-14T04:08:24.000000Z
LinguaMark: Do Multimodal Models Speak Fairly? A Benchmark-Based Evaluation
cs.AI updates on arXiv.org
2025-07-11T04:04:05.000000Z
PULSE: Practical Evaluation Scenarios for Large Multimodal Model Unlearning
cs.AI updates on arXiv.org
2025-07-03T04:07:25.000000Z
MMSearch-R1: End-to-End Reinforcement Learning for Active Image Search in LMMs
MarkTechPost@AI
2025-04-07T04:08:41.000000Z
Salesforce AI Research Introduce xGen-MM (BLIP-3): A Scalable AI Framework for Advancing Large Multimodal Models with Enhanced Training and Performance Capabilities
MarkTechPost@AI
2024-08-19T22:04:54.000000Z
MINT-1T Dataset Released: A Multimodal Dataset with One Trillion Tokens to Build Large Multimodal Models
MarkTechPost@AI
2024-07-26T12:04:20.000000Z
Visual Haystacks Benchmark: The First “Visual-Centric” Needle-In-A-Haystack (NIAH) Benchmark to Assess LMMs’ Capability in Long-Context Visual Retrieval and Reasoning
MarkTechPost@AI
2024-07-24T07:19:20.000000Z
LLaVA-NeXT-Interleave: A Versatile Large Multimodal Model LMM that can Handle Settings like Multi-image, Multi-frame, and Multi-view
MarkTechPost@AI
2024-07-13T16:46:13.000000Z
LongVA and the Impact of Long Context Transfer in Visual Processing: Enhancing Large Multimodal Models for Long Video Sequences
MarkTechPost@AI
2024-06-29T07:01:45.000000Z