Trending
Articles related to "MLLMs"
From Generator to Embedder: Harnessing Innate Abilities of Multimodal LLMs via Building Zero-Shot Discriminative Embedding Model
cs.AI updates on arXiv.org 2025-08-05T11:28:49.000000Z
EH-Benchmark Ophthalmic Hallucination Benchmark and Agent-Driven Top-Down Traceable Reasoning Workflow
cs.AI updates on arXiv.org 2025-08-05T11:10:26.000000Z
Multimodal Large Language Models for End-to-End Affective Computing: Benchmarking and Boosting with Generative Knowledge Prompting
cs.AI updates on arXiv.org 2025-08-05T11:10:21.000000Z
FairReason: Balancing Reasoning and Social Bias in MLLMs
cs.AI updates on arXiv.org 2025-08-01T04:08:13.000000Z
HiProbe-VAD: Video Anomaly Detection via Hidden States Probing in Tuning-Free Multimodal LLMs
cs.AI updates on arXiv.org 2025-07-24T05:31:20.000000Z
Pixels, Patterns, but No Poetry: To See The World like Humans
cs.AI updates on arXiv.org 2025-07-24T05:31:03.000000Z
CCL-XCoT: An Efficient Cross-Lingual Knowledge Transfer Method for Mitigating Hallucination Generation
cs.AI updates on arXiv.org 2025-07-22T04:34:45.000000Z
Exposing and Mitigating Calibration Biases and Demographic Unfairness in MLLM Few-Shot In-Context Learning for Medical Image Classification
cs.AI updates on arXiv.org 2025-07-22T04:34:26.000000Z
Automating Steering for Safe Multimodal Large Language Models
cs.AI updates on arXiv.org 2025-07-18T04:13:55.000000Z
Let's Think in Two Steps: Mitigating Agreement Bias in MLLMs with Self-Grounded Verification
cs.AI updates on arXiv.org 2025-07-17T04:14:12.000000Z
Warehouse Spatial Question Answering with LLM Agent
cs.AI updates on arXiv.org 2025-07-16T04:28:56.000000Z
Prompt4Trust: A Reinforcement Learning Prompt Augmentation Framework for Clinically-Aligned Confidence Calibration in Multimodal Large Language Models
cs.AI updates on arXiv.org 2025-07-15T04:24:38.000000Z
PyVision: Agentic Vision with Dynamic Tooling
cs.AI updates on arXiv.org 2025-07-11T04:04:21.000000Z
Investigating Redundancy in Multimodal Large Language Models with Multiple Vision Encoders
cs.AI updates on arXiv.org 2025-07-08T05:54:14.000000Z
Enhancing Sports Strategy with Video Analytics and Data Mining: Assessing the effectiveness of Multimodal LLMs in tennis video analysis
cs.AI updates on arXiv.org 2025-07-08T04:34:01.000000Z
HV-MMBench: Benchmarking MLLMs for Human-Centric Video Understanding
cs.AI updates on arXiv.org 2025-07-08T04:33:50.000000Z
PathCoT: Chain-of-Thought Prompting for Zero-shot Pathology Visual Reasoning
cs.AI updates on arXiv.org 2025-07-03T04:07:17.000000Z
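Chinese Academy of Sciences researchers confirm for the first time: large language models can "understand" things the way humans do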
IT之家 2025-06-11T01:38:31.000000Z
ICML 2025 Spotlight | Do multimodal large models have weak spots? The EMMA benchmark takes a deep look at multimodal reasoning ability
机器之心 2025-05-20T06:50:21.000000Z
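GPT-4o loses to Qwen, and no model earns a passing score! A joint team from UC Berkeley, HKU, and others proposes a new multimodal benchmark that tests multi-view understanding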
智源社区 2025-05-16T05:03:38.000000Z