Multimodal Video Emotion Recognition with Reliable Reasoning Priors

cs.AI updates on arXiv.org, 6 hours ago

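Summary: This study integrates reliable prior reasoning knowledge from MLLMs into multimodal emotion recognition. Gemini generates the reasoning traces, and Balanced Dual-Contrastive Learning is introduced to improve emotion recognition performance.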

arXiv:2508.03722v1 Announce Type: cross

Abstract: This study investigates the integration of trustworthy prior reasoning knowledge from MLLMs into multimodal emotion recognition. We employ Gemini to generate fine-grained, modality-separable reasoning traces, which are injected as priors during the fusion stage to enrich cross-modal interactions. To mitigate the pronounced class imbalance in multimodal emotion recognition, we introduce Balanced Dual-Contrastive Learning, a loss formulation that jointly balances inter-class and intra-class distributions. Applied to the MER2024 benchmark, our prior-enhanced framework yields substantial performance gains, demonstrating that the reliability of MLLM-derived reasoning can be synergistically combined with the domain adaptability of lightweight fusion networks for robust, scalable emotion recognition.
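The abstract does not give the formula for Balanced Dual-Contrastive Learning, so the following is only a rough sketch of the general idea of rebalancing a contrastive loss under class imbalance, not the authors' formulation. It assumes a supervised contrastive loss whose denominator averages similarities within each class, so that frequent emotion classes do not dominate. The function name `balanced_supcon_loss`, the temperature `tau`, and the toy data are all hypothetical.

```python
import numpy as np

def balanced_supcon_loss(z, y, tau=0.1):
    """Class-balanced supervised contrastive loss (illustrative assumption,
    not the loss defined in the paper).

    z: (N, D) array of embeddings, y: (N,) array of integer class labels.
    """
    z = np.asarray(z, dtype=np.float64)
    y = np.asarray(y).reshape(-1)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    n = z.shape[0]

    not_self = ~np.eye(n, dtype=bool)
    same = (y[:, None] == y[None, :]) & not_self      # positives per anchor

    # Size of each sample's class, not counting the anchor itself.
    _, inv, counts = np.unique(y, return_inverse=True, return_counts=True)
    inv = inv.reshape(-1)
    class_size = counts[inv][None, :] - same.astype(int)
    weights = not_self / class_size                   # 1/n_c per contrast sample

    # Numerically stable log of the class-averaged denominator.
    logits = np.where(not_self, z @ z.T / tau, -np.inf)
    row_max = logits.max(axis=1, keepdims=True)
    denom = (weights * np.exp(logits - row_max)).sum(axis=1, keepdims=True)
    log_prob = logits - (row_max + np.log(denom))

    # Average log-probability over positives; skip anchors without positives.
    pos_counts = same.sum(axis=1)
    valid = pos_counts > 0
    if not valid.any():
        return 0.0
    pos_log_prob = np.where(same, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob / np.maximum(pos_counts, 1)
    return float(per_anchor[valid].mean())

# Toy imbalanced batch: six samples of one class, two of another.
embeddings = np.array([[1.0, 0.0]] * 6 + [[0.0, 1.0]] * 2)
labels = np.array([0] * 6 + [1] * 2)
print(balanced_supcon_loss(embeddings, labels))
```

Dividing each contrast term by its class size means a class with six samples and a class with two contribute equally to the denominator, which is one common way to counter imbalance. How the paper balances the intra-class side, and how the Gemini reasoning priors enter the fusion stage, is not specified in the abstract and is not modeled here.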

Related tags: MLLM, multimodal emotion recognition, reasoning knowledge, fusion network, performance improvement
