The 15 papers covered in this episode:
00:24 💡 Light of Normals: Unified Feature Representation for Universal Photometric Stereo
01:00 🎨 OmniGen2: Exploration to Advanced Multimodal Generation
01:39 ✍ LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning
02:17 🎭 Phantom-Data: Towards a General Subject-Consistent Video Generation Dataset
02:58 🧠 RLPR: Extrapolating RLVR to General Domains without Verifiers
03:36 🧠 ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs
04:11 🤖 OAgents: An Empirical Study of Building Effective Agents
04:52 🖼 Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
05:31 🎬 VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory
06:06 🧑 LettinGo: Explore User Profile Generation for Recommendation System
06:48 🔀 ReDit: Reward Dithering for Improved LLM Policy Optimization
07:29 💡 FinCoT: Grounding Chain-of-Thought in Expert Financial Reasoning
08:08 🎬 ViDAR: Video Diffusion-Aware 4D Reconstruction From Monocular Inputs
08:47 🖼 Auto-Regressively Generating Multi-View Consistent Images
09:35 💡 SlimMoE: Structured Compression of Large MoE Models via Expert Slimming and Distillation

[Follow Us]
You can also find us on the following platform for more information beyond the podcast:
Xiaohongshu: AI速递