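2025.03.13 | Lowering the compute requirements of video diffusion models and improving the quality of multi-view video generation.

The 15 papers in this episode are: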

[00:20] TPDiff: Temporal Pyramid Video Diffusion Model

[00:58] Reangle-A-Video: 4D Video Generation as Video-to-Video Translation

[01:42] Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

[02:18] RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling

[02:55] GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training

[03:36] More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG

[04:19] Motion Anything: Any to Motion Generation

[05:15] WildIFEval: Instruction Following in the Wild

[05:49] VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary

[06:29] Quantizing Large Language Models for Code Generation: A Differentiated Replication

[07:13] Cost-Optimal Grouped-Query Attention for Long-Context LLMs

[07:53] Multimodal Language Modeling for High-Accuracy Single Cell Transcriptomics Analysis and Generation

[08:33] Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space

[09:15] Self-Taught Self-Correction for Small Language Models

[09:49] MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System

【Follow us】

You can also find us on the following platform for more beyond the podcast itself:

Xiaohongshu: AI速递
