cs.AI updates on arXiv.org, Jul 8, 12:33
Multimodal LLM Integrated Semantic Communications for 6G Immersive Experiences

This article introduces MLLM-SC, a novel multimodal large language model (MLLM) integrated semantic communications framework designed to address the challenges 6G networks face over resource-constrained wireless communication systems, and validates its effectiveness on AR/VR applications and image generation.

arXiv:2507.04621v1 Announce Type: cross Abstract: 6G networks promise revolutionary immersive communication experiences including augmented reality (AR), virtual reality (VR), and holographic communications. These applications demand high-dimensional multimodal data transmission and intelligent data processing in real time, which is extremely challenging over resource-limited wireless communication systems. Moreover, a joint understanding of the environment, context, and user intent is essential to deliver task-relevant content effectively. This article presents a novel multimodal large language model (MLLM) integrated semantic communications framework, termed MLLM-SC, which fully leverages the reasoning and generative capabilities of pre-trained foundation models for context-aware and task-oriented wireless communication. The MLLM-SC framework adopts a device-edge collaborative architecture. At the edge, an MLLM-empowered semantic guidance module analyzes multimodal inputs, user intents, and channel conditions to generate importance-aware attention maps that prioritize semantically critical information. An importance-aware semantic encoder and a resource-adaptive semantic decoder are jointly designed and optimized; they utilize the semantic guidance for adaptive bandwidth allocation and high-quality content reconstruction or generation. Extensive case studies on visual question answering for AR/VR applications and diffusion-driven image generation validate the effectiveness of MLLM-SC.
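To make the bandwidth-allocation idea concrete, the following is a minimal, hypothetical Python sketch of how an importance-aware attention map from a guidance module could be turned into a per-patch channel-symbol budget. The abstract does not specify this mechanism; the function name, the proportional allocation rule, and the patch grid here are illustrative assumptions, not the paper's method.

```python
# Hypothetical sketch (not from the paper): turn an importance-aware
# attention map into a per-patch channel-symbol allocation.
import numpy as np

def allocate_bandwidth(attention_map: np.ndarray, total_symbols: int,
                       min_symbols: int = 1) -> np.ndarray:
    """Split a channel-symbol budget across feature patches roughly in
    proportion to their semantic importance (higher score -> more symbols)."""
    weights = attention_map.flatten()
    weights = weights / weights.sum()                      # normalize importance
    alloc = np.floor(weights * total_symbols).astype(int)  # proportional share
    alloc = np.maximum(alloc, min_symbols)                 # per-patch floor
    # Trim any overshoot caused by the floor, starting with the least
    # important patches.
    excess = int(alloc.sum()) - total_symbols
    for idx in np.argsort(weights):
        if excess <= 0:
            break
        take = min(excess, int(alloc[idx]) - min_symbols)
        alloc[idx] -= take
        excess -= take
    return alloc.reshape(attention_map.shape)

if __name__ == "__main__":
    # 4x4 grid of patch importance scores, e.g. produced by an MLLM
    # guidance module (random values here for illustration only).
    attn = np.random.rand(4, 4)
    symbols = allocate_bandwidth(attn, total_symbols=256)
    print(symbols, symbols.sum())
```

In practice, the actual MLLM-SC encoder and decoder are learned jointly, so any such allocation rule would be one component inside an end-to-end optimized pipeline rather than a standalone heuristic.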


Related tags: 6G networks, semantic communication, MLLM