The 12 papers in this episode:
00:23 🧲 Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
01:04 🖼 Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding
01:49 🔀 PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models
02:30 🤖 VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning
03:08 🎮 Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition
03:48 🖼 DreamCube: 3D Panorama Generation via Multi-plane Synchronization
04:26 🖼 Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details
05:06 💽 InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding
05:48 🖼 Hunyuan3D 2.1: From Images to High-Fidelity 3D Assets with Production-Ready PBR Material
06:36 🧠 UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation
07:16 ⚖ Reranking-based Generation for Unbiased Perspective Summarization
07:52 🚗 Long-term Traffic Simulation with Interleaved Autoregressive Motion and Scenario Generation

[Follow Us]
You can also find us on the following platforms for more information beyond the podcast:
Xiaohongshu (小红书): AI速递