The 17 papers in this episode are:
[00:24] Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers
[01:11] ABC: Achieving Better Control of Multimodal Embeddings using VLMs
[01:47] Enhancing Abnormality Grounding for Vision Language Models with Knowledge Descriptions
[02:24] GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control
[03:02] KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding
[03:43] CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom
[04:26] QE4PE: Word-level Quality Estimation for Human Post-Editing
[05:08] Exploring Rewriting Approaches for Different Conversational Tasks
[05:43] Process-based Self-Rewarding Language Models
[06:23] Fine-Tuning Small Language Models for Domain-Specific AI: An Edge AI Perspective
[07:00] Mixture of Structural-and-Textual Retrieval over Text-rich Graph Knowledge Bases
[07:40] Retrieval Models Aren't Tool-Savvy: Benchmarking Tool Retrieval for Large Language Models
[08:22] FLAME: A Federated Learning Benchmark for Robotic Manipulation
[09:01] Benchmarking Large Language Models for Multi-Language Software Vulnerability Detection
[09:53] CognitiveDrone: A VLA Model and Evaluation Benchmark for Real-Time Cognitive Task Solving and Reasoning in UAVs
[10:36] Interact, Instruct to Improve: A LLM-Driven Parallel Actor-Reasoner Framework for Enhancing Autonomous Vehicle Interactions
[11:14] SwiLTra-Bench: The Swiss Legal Translation Benchmark

[Follow Us]
You can also find us on the following platforms for more information beyond the podcast:
Xiaohongshu (小红书): AI速递