2025.07.10 | A breakthrough in zero-shot motion generation; improved 4K image super-resolution.

The 14 papers in this episode are as follows:

00:22 🤸 Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data

01:03 🖼 4KAgent: Agentic Any Image to 4K Super-Resolution

01:39 🖼 Perception-Aware Policy Optimization for Multimodal Reasoning

02:24 🧪 Rethinking Verification for LLM Code Generation: From Generation to Testing

03:05 🤔 A Systematic Analysis of Hybrid Linear Attention

03:42 🧠 First Return, Entropy-Eliciting Explore

04:23 🤖 AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs

05:05 🧩 Towards Solving More Challenging IMO Problems via Decoupled Reasoning and Proving

05:47 🚗 A Survey on Vision-Language-Action Models for Autonomous Driving

06:29 🧪 DiffSpectra: Molecular Structure Elucidation from Spectra using Diffusion Models

07:09 🗣 ModelCitizens: Representing Community Voices in Online Safety

07:50 🤖 SRT-H: A Hierarchical Framework for Autonomous Surgery via Language Conditioned Imitation Learning

08:32 🔬 Evaluating the Critical Risks of Amazon's Nova Premier under the Frontier Model Safety Framework

09:21 🧐 AdamMeme: Adaptively Probe the Reasoning Capacity of Multimodal Large Language Models on Harmfulness

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast content:

Xiaohongshu (小红书): AI速递

Fish AI Reader