The 29 papers in this episode:
[00:23] ⚡ Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
[01:10] Learning Getting-Up Policies for Real-World Humanoid Robots
[01:55] ReLearn: Unlearning via Learning for Large Language Models
[02:35] SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?
[03:21] HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
[03:58] How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training
[04:33] SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors
[05:12] Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening
[05:55] I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models
[06:38] SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL
[07:25] CRANE: Reasoning with constrained LLM generation
[08:07] Intuitive physics understanding emerges from self-supervised pretraining on natural videos
[08:46] Cuckoo: An IE Free Rider Hatched by Massive Nutrition in LLM's Nest
[09:22] Dyve: Thinking Fast and Slow for Dynamic Process Verification
[10:06] PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning
[10:53] System Message Generation for User Preferences using Open-Source Models
[11:38] video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model
[12:33] Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarcity
[13:11] Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning
[13:52] MagicArticulate: Make Your 3D Models Articulation-Ready
[14:37] Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems
[15:21] One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs
[16:03] Can a Single Model Master Both Multi-turn Conversations and Tool Use? CALM: A Unified Conversational Agentic Language Model
[16:40] Better Embeddings with Coupled Adam
[17:18] Show Me the Work: Fact-Checkers' Requirements for Explainable Automated Fact-Checking
[17:56] Towards Data-Efficient Pretraining for Atomic Property Prediction
[18:46] The Mirage of Model Editing: Revisiting Evaluation in the Wild
[19:31] Large Language Models and Mathematical Reasoning Failures
[20:11] Language Complexity Measurement as a Noisy Zero-Shot Proxy for Evaluating LLM Performance

【Follow Us】
You can also find us on the following platform for more information beyond the podcast:
Xiaohongshu (RED): AI速递