LLaVA_Fishai

Articles related to "LLaVA"

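ICCV 2025 | Visual Tokens Jump! Shanghai Jiao Tong University and Ant Group Jointly Release a General-Purpose Multimodal Acceleration Framework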

PaperWeekly 2025-07-14T00:18:59.000000Z

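ICCV 2025 | Visual Tokens Jump! Shanghai Jiao Tong University and Ant Group Jointly Release a General-Purpose Multimodal Acceleration Framework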

PaperWeekly 2025-07-10T15:37:47.000000Z

How a Thousand-GPU Cluster Efficiently Trains Multimodal Large Models: Hands-On Lessons from the vivo AI Team | AICon

AI前线 2025-06-18T09:07:25.000000Z

Unifying SAM2 and LLaVA! ByteDance Doubao Proposes Sa2VA, a Dense Video Multimodal Large Model

机器之心 2025-02-12T07:53:19.000000Z

Sa2VA: A Unified AI Framework for Dense Grounded Video and Image Understanding through SAM-2 and LLaVA Integration

MarkTechPost@AI 2025-01-12T19:35:04.000000Z

Are SAE features from the Base Model still meaningful to LLaVA?

LessWrong 2024-12-05T21:02:28.000000Z

Matching the Original Model with Just 12% of the Compute: Adobe, University of Rochester, and Others Propose the YOPO Pruning Technique

机器之心 2024-11-28T05:54:25.000000Z

UGround: A Universal GUI Visual Grounding Model Developed with Large-Scale Web-based Synthetic Data

MarkTechPost@AI 2024-10-12T06:49:50.000000Z

LLaVaOLMoBitnet1B: The First Ternary Multimodal LLM Capable of Accepting Image(s) and Text Inputs to Produce Coherent Textual Response

MarkTechPost@AI 2024-09-03T09:20:13.000000Z

SF-LLaVA: A Training-Free Video LLM that is Built Upon LLaVA-NeXT and Requires No Additional Fine-Tuning to Work Effectively for Various Video Tasks

MarkTechPost@AI 2024-07-25T08:19:17.000000Z

Math-LLaVA: A LLaVA-1.5-based AI Model Fine-Tuned with MathV360K Dataset

MarkTechPost@AI 2024-07-01T12:01:47.000000Z

LLaVA-NeXT: Advancements in Multimodal Understanding and Video Comprehension

MarkTechPost@AI 2024-05-15T04:31:00.000000Z
