E3RG: Building Explicit Emotion-driven Empathetic Response Generation System with Multimodal Large Language Model

cs.AI updates on arXiv.org, 13 hours ago

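This paper proposes the E3RG system, which uses multimodal LLMs to process emotional content and produce natural, emotionally rich responses. It achieved strong results in a multimodal empathy challenge.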

arXiv:2508.12854v1 (announce type: new)

Abstract: Multimodal Empathetic Response Generation (MERG) is crucial for building emotionally intelligent human-computer interactions. Although large language models (LLMs) have improved text-based ERG, challenges remain in handling multimodal emotional content and maintaining identity consistency. We therefore propose E3RG, an Explicit Emotion-driven Empathetic Response Generation System based on multimodal LLMs, which decomposes the MERG task into three parts: multimodal empathy understanding, empathy memory retrieval, and multimodal response generation. By integrating advanced expressive speech and video generative models, E3RG delivers natural, emotionally rich, and identity-consistent responses without extra training. Experiments validate the superiority of our system in both zero-shot and few-shot settings, securing the Top-1 position in the Avatar-based Multimodal Empathy Challenge at ACM MM 25. Our code is available at https://github.com/RH-Lin/E3RG.
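The three-part decomposition named in the abstract can be pictured as a small pipeline. The sketch below is illustrative only and is not taken from the E3RG repository: every name in it (`Turn`, `Response`, `understand_empathy`, `retrieve_memory`, `generate_response`, `respond`) is hypothetical, and a keyword lookup and a small dictionary stand in for the multimodal LLM and the empathy memory. The speech and video generation steps are represented only by reference paths that a real system would pass to its generative models.

```python
import re
from dataclasses import dataclass


@dataclass
class Turn:
    """One user turn; audio and video are placeholder paths (hypothetical)."""
    text: str
    audio_path: str = ""
    video_path: str = ""


@dataclass
class Response:
    text: str
    emotion: str    # explicit emotion label that conditioned the reply
    voice_ref: str  # reference audio a speech model could use to keep the voice consistent
    face_ref: str   # reference image a video model could use to keep the avatar consistent


# Toy lexicon standing in for a multimodal LLM's emotion prediction (assumption).
EMOTION_KEYWORDS = {
    "sad": {"lost", "miss", "lonely"},
    "happy": {"won", "great", "excited"},
}

# Toy "empathy memory": exemplar replies indexed by emotion label (assumption).
EMPATHY_MEMORY = {
    "sad": "I'm so sorry you're going through this.",
    "happy": "That's wonderful news, congratulations!",
    "neutral": "Thanks for sharing that with me.",
}


def understand_empathy(turn: Turn) -> str:
    """Stage 1 stand-in: return an explicit emotion label for the turn."""
    words = set(re.findall(r"[a-z']+", turn.text.lower()))
    for emotion, keywords in EMOTION_KEYWORDS.items():
        if words & keywords:
            return emotion
    return "neutral"


def retrieve_memory(emotion: str) -> str:
    """Stage 2 stand-in: fetch an exemplar reply matching the emotion."""
    return EMPATHY_MEMORY.get(emotion, EMPATHY_MEMORY["neutral"])


def generate_response(emotion: str, exemplar: str,
                      voice_ref: str, face_ref: str) -> Response:
    """Stage 3 stand-in: reuse the exemplar as the reply text.

    A real system would prompt an LLM here and then hand the text, the
    emotion label, and the identity references to speech and video models.
    """
    return Response(text=exemplar, emotion=emotion,
                    voice_ref=voice_ref, face_ref=face_ref)


def respond(turn: Turn, voice_ref: str, face_ref: str) -> Response:
    """Chain the three stages in the order listed in the abstract."""
    emotion = understand_empathy(turn)
    exemplar = retrieve_memory(emotion)
    return generate_response(emotion, exemplar, voice_ref, face_ref)


if __name__ == "__main__":
    reply = respond(Turn("I lost my dog yesterday."), "voice.wav", "face.png")
    print(reply.emotion, "|", reply.text)
```

Passing the emotion label explicitly between stages mirrors the "explicit emotion-driven" framing of the abstract; how the actual system represents and uses that label is not specified here.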

Related tags

Multimodal empathetic response generation, E3RG system, emotion recognition