The 13 papers in this episode are as follows:
[00:22] 🎨 OmniSVG: A Unified Scalable Vector Graphics Generation Model
[01:02] 🧠 Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought
[01:42] 🖼 An Empirical Study of GPT-4o Image Generation Capabilities
[02:22] 🚀 Hogwild! Inference: Parallel LLM Generation via Concurrent Attention
[03:03] 🎨 Less-to-More Generalization: Unlocking More Controllability by In-Context Generation
[03:46] 🧠 COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values
[04:24] 🤔 Generative Evaluation of Complex Reasoning in Large Language Models
[05:14] 🎨 Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model
[05:53] 🎮 V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models
[06:32] 🧩 CrossWordBench: Evaluating the Reasoning Capabilities of LLMs and LVLMs with Controllable Puzzle Generation
[07:15] 🖼 HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
[07:57] 💡 Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence
[08:41] 🤖 Leanabell-Prover: Posttraining Scaling in Formal Reasoning

[Follow Us]
You can also find us on the following platforms for more information beyond the podcast content.
Xiaohongshu: AI速递