HuggingFace Daily AI Paper Digest, March 1
2025.02.28 | Self-correction improves mathematical reasoning; reinforcement learning improves medical reasoning.

 

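This episode covers 19 papers spanning mathematical reasoning correction, incentivizing medical reasoning capability, applications of multimodal models, and other areas.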

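🧠Self-rewarding correction for mathematical reasoning, on correcting mathematical reasoning

🧠MedVLM-R1, which uses reinforcement learning to incentivize the medical reasoning capability of vision-language models

🧠R2-T2, which applies test-time re-routing to multimodal mixture-of-experts models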

The 19 papers in this episode are:

[00:23] Self-rewarding correction for mathematical reasoning

[01:03] MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning

[01:53] R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts

[02:34] LongRoPE2: Near-Lossless LLM Context Window Scaling

[03:11] FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving

[04:02] CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale

[04:48] Lean and Mean: Decoupled Value Policy Optimization with Global Value Guidance

[05:33] UniTok: A Unified Tokenizer for Visual Generation and Understanding

[06:12] NeoBERT: A Next-Generation BERT

[06:47] FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute

[07:30] SoRFT: Issue Resolving with Subtask-oriented Reinforced Fine-Tuning

[08:07] Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting

[08:45] Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think

[09:30] Mobius: Text to Seamless Looping Video Generation via Latent Shift

[10:08] Guardians of the Agentic System: Preventing Many Shots Jailbreak with Agentic System

[10:49] R1-T1: Fully Incentivizing Translation Capability in LLMs via Reasoning Learning

[11:29] On Relation-Specific Neurons in Large Language Models

[12:05] Training Consistency Models with Variational Noise Coupling

[12:46] ⚡ Efficient Gaussian Splatting for Monocular Dynamic Scene Rendering via Sparse Time-Variant Attribute Modeling

【Follow Us】

You can also find us on the following platform for more information beyond the podcast

Xiaohongshu: AI速递

