"
混合架构
" 相关文章
Tencent Hunyuan TurboS technical report fully public for the first time: 560B-parameter hybrid Mamba architecture with adaptive fusion of long and short chains of thought
AI前线
2025-05-23T11:51:23Z
Technology Innovation Institute TII Releases Falcon-H1: Hybrid Transformer-SSM Language Models for Scalable, Multilingual, and Long-Context Understanding
MarkTechPost@AI
2025-05-22T06:50:51Z
RWKV-X Combines Sparse Attention and Recurrent Memory to Enable Efficient 1M-Token Decoding with Linear Complexity
MarkTechPost@AI
2025-05-05T18:10:34Z
Tencent Hunyuan and NVIDIA both release hybrid-architecture models: is the Mamba-Transformer about to take off?
36kr
2025-03-24T09:47:31Z
Tencent Hunyuan and NVIDIA both release hybrid-architecture models: is the Mamba-Transformer about to take off?
机器之心
2025-03-24T06:52:03Z
Tencent Hunyuan releases its self-developed deep-thinking model T1: fast output, near-instant responses, and strong ultra-long-text handling
IT之家
2025-03-21T15:42:02Z
NVIDIA proposes the first Mamba-Transformer vision backbone, breaking the accuracy/throughput bottleneck | CVPR 2025
智源社区
2025-03-09T06:09:22Z
NVIDIA proposes the first Mamba-Transformer vision backbone, breaking the accuracy/throughput bottleneck | CVPR 2025
新智元
2025-03-08T07:10:40Z
MiniMax-Text-01 and MiniMax-VL-01 Released: Scalable Models with Lightning Attention, 456B Parameters, 4B Token Contexts, and State-of-the-Art Accuracy
MarkTechPost@AI
2025-01-15T20:02:58Z
This AI Paper Introduces TinyViM: A Frequency-Decoupling Hybrid Architecture for Efficient and Accurate Computer Vision Tasks
MarkTechPost@AI
2024-12-01T08:20:03Z
In conversation with Zircuit: inside the hybrid-architecture L2 backed by Binance, Pantera, and Dragonfly
ForesightNews文章
2024-11-01T09:25:40Z
LongLLaVA: A Breakthrough Hybrid Architecture Combining Mamba and Transformer Layers to Efficiently Process Large-Scale Multi-Modal Data with Unmatched Accuracy and Performance
MarkTechPost@AI
2024-09-12T18:35:49Z
Rescuing the Transformer's reasoning ability: DeepMind's new TransNAR study embeds an "algorithmic reasoning brain" into the model
36kr
2024-06-17T09:03:52Z