HuggingFace Daily AI Paper Digest, February 19
2025.02.18 | Sparse attention improves efficiency; humanoid robots learn getting-up policies.

This episode introduces 29 papers, covering native sparse attention, getting-up policies for humanoid robots, learning and applications of large language models, and several other research areas.

⚡ Native sparse attention: hardware-aligned, natively trainable sparse attention.

🤖 Learning getting-up policies for real-world humanoid robots, advancing robotics.

🧠 ReLearn: an inventive approach to unlearning via learning for large language models.

💻 Can frontier LLMs earn real money from freelance software engineering? And more.

The 29 papers in this episode are as follows:

[00:23] ⚡ Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

[01:10] Learning Getting-Up Policies for Real-World Humanoid Robots

[01:55] ReLearn: Unlearning via Learning for Large Language Models

[02:35] SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?

[03:21] HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation

[03:58] How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training

[04:33] SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors

[05:12] Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening

[05:55] I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models

[06:38] SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL

[07:25] CRANE: Reasoning with constrained LLM generation

[08:07] Intuitive physics understanding emerges from self-supervised pretraining on natural videos

[08:46] Cuckoo: An IE Free Rider Hatched by Massive Nutrition in LLM's Nest

[09:22] Dyve: Thinking Fast and Slow for Dynamic Process Verification

[10:06] PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning

[10:53] System Message Generation for User Preferences using Open-Source Models

[11:38] video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model

[12:33] Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarsity

[13:11] Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning

[13:52] MagicArticulate: Make Your 3D Models Articulation-Ready

[14:37] Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems

[15:21] One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs

[16:03] Can a Single Model Master Both Multi-turn Conversations and Tool Use? CALM: A Unified Conversational Agentic Language Model

[16:40] Better Embeddings with Coupled Adam

[17:18] Show Me the Work: Fact-Checkers' Requirements for Explainable Automated Fact-Checking

[17:56] Towards Data-Efficient Pretraining for Atomic Property Prediction

[18:46] The Mirage of Model Editing: Revisiting Evaluation in the Wild

[19:31] Large Language Models and Mathematical Reasoning Failures

[20:11] Language Complexity Measurement as a Noisy Zero-Shot Proxy for Evaluating LLM Performance

【Follow Us】

You can also find us on the following platforms for more content beyond the podcast:

Xiaohongshu: AI速递

Fish AI Reader

AI-assisted writing with a range of professional templates, in-depth analysis, and high-quality content generation. From key-point extraction to deep reflection, FishAI provides end-to-end writing support. The new version introduces custom parameters for more personalized and precise output.

鱼阅 (Fish AI Reader): the next intelligent information assistant for the AI era, helping you leave information anxiety behind.

Contact: 441953276@qq.com

Tags

Paper research, multiple domains, language models, robotics