Video Large Language Models_Fishai

Articles related to "Video Large Language Models"

VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding

cs.AI updates on arXiv.org 2025-07-18T04:13:59.000000Z

This AI Paper Introduces DyCoke: Dynamic Token Compression for Efficient and High-Performance Video Large Language Models

MarkTechPost@AI 2024-11-28T12:04:54.000000Z

VideoLLaMA 2 Released: A Set of Video Large Language Models Designed to Advance Multimodal Research in the Arena of Video-Language Modeling

MarkTechPost@AI 2024-08-15T08:19:57.000000Z
