Hot Topics
Articles related to "Multimodal Large Language Models"
Multimodal LLMs as Customized Reward Models for Text-to-Image Generation
cs.AI updates on arXiv.org 2025-07-30T04:46:14.000000Z
T2I-Copilot: A Training-Free Multi-Agent Text-to-Image System for Enhanced Prompt Interpretation and Interactive Generation
cs.AI updates on arXiv.org 2025-07-29T04:22:26.000000Z
DOGR: Towards Versatile Visual Document Grounding and Referring
cs.AI updates on arXiv.org 2025-07-22T04:34:00.000000Z
A Survey on MLLM-based Visually Rich Document Understanding: Methods, Challenges, and Emerging Trends
cs.AI updates on arXiv.org 2025-07-15T04:26:56.000000Z
Advancing MLLM Alignment Through MM-RLHF: A Large-Scale Human Preference Dataset for Multimodal Tasks
MarkTechPost@AI 2025-02-19T18:33:56.000000Z
Meta AI Releases LongVU: A Multimodal Large Language Model that can Address the Significant Challenge of Long Video Understanding
MarkTechPost@AI 2024-10-30T22:50:09.000000Z
SafeBench: A Safety Evaluation Framework for Multimodal Large Models That Reveals MLLM Safety Risks
MIT Technology Review - This Week's Hot List 2024-10-28T02:45:33.000000Z
Nature Methods Special Issue Commentary: Using the "Key" of Artificial Intelligence to Open the "Lock" of Spatial Omics
Swarma Club (集智俱乐部) 2024-08-25T04:25:11.000000Z
ProcTag: A Data-Oriented AI Method that Assesses the Efficacy of Document Instruction Data
MarkTechPost@AI 2024-07-23T09:18:50.000000Z
CharXiv: A Comprehensive Evaluation Suite Advancing Multimodal Large Language Models Through Realistic Chart Understanding Benchmarks
MarkTechPost@AI 2024-06-29T04:01:35.000000Z