Conditional Video Generation for High-Efficiency Video Compression

cs.AI updates on arXiv.org, July 22, 12:44

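This paper proposes a video compression framework that uses conditional diffusion models for perceptually optimized reconstruction. Through multi-granular conditioning, compact representations, and multi-condition training, it significantly improves the perceptual quality of compressed video.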

arXiv:2507.15269v1 Announce Type: cross

Abstract: Perceptual studies demonstrate that conditional diffusion models excel at reconstructing video content aligned with human visual perception. Building on this insight, we propose a video compression framework that leverages conditional diffusion models for perceptually optimized reconstruction. Specifically, we reframe video compression as a conditional generation task, where a generative model synthesizes video from sparse yet informative signals. Our approach introduces three key modules: (1) multi-granular conditioning that captures both static scene structure and dynamic spatio-temporal cues; (2) compact representations designed for efficient transmission without sacrificing semantic richness; (3) multi-condition training with modality dropout and role-aware embeddings, which prevent over-reliance on any single modality and enhance robustness. Extensive experiments show that our method significantly outperforms both traditional and neural codecs on perceptual quality metrics such as Fréchet Video Distance (FVD) and LPIPS, especially under high compression ratios.
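The abstract does not say how modality dropout or role-aware embeddings are implemented, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each conditioning signal is a small feature vector, that a dropped signal is replaced by zeros, and that a role embedding is a fixed per-modality vector added to the features. The modality names (`static_scene`, `motion_cues`), the function names, and the `keep_at_least_one` option are all hypothetical.

```python
import random


def apply_modality_dropout(conditions, drop_prob, rng, keep_at_least_one=True):
    """Randomly drop conditioning signals for one training sample.

    conditions: dict mapping a modality name to a feature vector (list of floats).
    Returns (kept, mask). Dropped modalities are replaced by zero vectors, and
    mask[name] is 1 if the modality was kept, else 0.
    Assumption: zeroing stands in for whatever "null" signal a real model uses.
    """
    names = list(conditions)
    mask = {name: 0 if rng.random() < drop_prob else 1 for name in names}
    # Optionally guarantee the sample still carries at least one condition.
    if keep_at_least_one and names and not any(mask.values()):
        mask[rng.choice(names)] = 1
    kept = {
        name: list(vec) if mask[name] else [0.0] * len(vec)
        for name, vec in conditions.items()
    }
    return kept, mask


def add_role_embeddings(conditions, role_table):
    """Add a per-modality role vector so each signal is tagged with its role.

    Assumption: role embeddings are simple additive vectors of matching length.
    """
    return {
        name: [x + r for x, r in zip(vec, role_table[name])]
        for name, vec in conditions.items()
    }


# Hypothetical conditions and role vectors, for illustration only.
rng = random.Random(0)
conditions = {"static_scene": [0.5, 0.2], "motion_cues": [0.1, 0.9]}
roles = {"static_scene": [1.0, 0.0], "motion_cues": [0.0, 1.0]}

kept, mask = apply_modality_dropout(conditions, drop_prob=0.5, rng=rng)
tagged = add_role_embeddings(kept, roles)
```

Because every modality is sometimes missing during training, a model trained this way cannot lean on any single signal, which is the robustness argument the abstract makes. In this sketch a dropped modality still receives its role vector, so the model can tell which signal is absent; that ordering is a design choice of the example, not something stated in the source.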


Related tags

Video compression, conditional diffusion models, perceptual optimization, compression ratio
