Articles related to "Multimodal LLM"
Zero-Shot Document Understanding using Pseudo Table of Contents-Guided Retrieval-Augmented Generation
cs.AI updates on arXiv.org
2025-08-01T04:08:31.000000Z
SpiroLLM: Finetuning Pretrained LLMs to Understand Spirogram Time Series with Clinical Validation in COPD Reporting
cs.AI updates on arXiv.org
2025-07-23T04:03:05.000000Z
LLMs are stuck in Plato's cave
LessWrong
2025-07-13T20:37:32.000000Z
MindFlow: Revolutionizing E-commerce Customer Support with Multimodal LLM Agents
cs.AI updates on arXiv.org
2025-07-09T04:01:39.000000Z
BlueLM-2.5-3B Technical Report
cs.AI updates on arXiv.org
2025-07-09T04:01:30.000000Z
This AI Paper Introduces WINGS: A Dual-Learner Architecture to Prevent Text-Only Forgetting in Multimodal Large Language Models
MarkTechPost@AI
2025-06-21T21:59:07.000000Z
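ByteDance and Tsinghua University open-source ChatTS, a multimodal time-series large model that supports dialogue and reasoning over time-series data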
机器之心
2025-05-23T07:00:20.000000Z
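Long-form survey led by the Chinese Academy of Sciences comprehensively and systematically reviews multimodal LLM alignment algorithms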
量子位
2025-04-09T10:03:35.000000Z
Extending reasoning to the real physical world: NVIDIA Cosmos-Reason1, whose 8B model outperforms OpenAI o1 on embodied reasoning
机器之心
2025-03-25T06:50:22.000000Z
Long-form survey led by an academician comprehensively and systematically reviews multimodal LLM alignment algorithms
智源社区
2025-03-24T16:51:11.000000Z
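Long-form survey led by the Chinese Academy of Sciences comprehensively and systematically reviews multimodal LLM alignment algorithms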
量子位
2025-03-24T10:22:18.000000Z
The Challenge of Captioning Video at More Than 1fps
Unite.AI
2025-03-20T05:15:01.000000Z
Patronus AI Introduces the Industry’s First Multimodal LLM-as-a-Judge (MLLM-as-a-Judge): Designed to Evaluate and Optimize AI Systems that Convert Image Inputs into Text Outputs
MarkTechPost@AI
2025-03-15T03:56:39.000000Z
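Alibaba open-sources R1-Omni, the first to combine DeepSeek-style RLVR with omni-modal emotion recognition; netizens: interpretability + multimodal learning = next-generation AI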
智源社区
2025-03-12T11:00:03.000000Z
Kimi k1.5: A Next Generation Multi-Modal LLM Trained with Reinforcement Learning on Advancing AI with Scalable Multimodal Reasoning and Benchmark Excellence
MarkTechPost@AI
2025-01-23T06:04:22.000000Z
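Microsoft's open-source Markdown tool goes viral: supports Office documents and can connect to a multimodal LLM to generate reports directly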
量子位
2025-01-21T17:09:43.000000Z
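Microsoft's open-source Markdown tool goes viral: supports Office documents and can connect to a multimodal LLM to generate reports directly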
智源社区
2025-01-21T08:07:17.000000Z
Roche | LAB IN A LOOP: Using data and artificial intelligence to transform drug discovery and development
智源社区
2025-01-19T04:07:16.000000Z
Democratizing AI: Implementing a Multimodal LLM-Based Multi-Agent System with No-Code Platforms for Business Automation
MarkTechPost@AI
2025-01-10T22:31:39.000000Z
ScreenSpot-Pro: The First Benchmark Driving Multi-Modal LLMs into High-Resolution Professional GUI-Agent and Computer-Use Environments
MarkTechPost@AI
2025-01-05T20:04:55.000000Z