Articles related to "Sliding Window Attention"
The Big LLM Architecture Comparison
Ahead of AI 2025-07-19T11:20:35.000000Z
Microsoft Researchers Introduce Samba 3.8B: A Simple Mamba+Sliding Window Attention Architecture that Outperforms Phi3-mini on Major Benchmarks
MarkTechPost@AI 2024-06-16T06:31:39.000000Z