Articles related to "Moonlight"
Researchers from Moonshot AI Introduce Muon and Moonlight: Optimizing Large-Scale Language Models with Efficient Training Techniques
MarkTechPost@AI
2025-02-25T17:32:57.000000Z
Moonshot AI and UCLA Researchers Release Moonlight: A 3B/16B-Parameter Mixture-of-Expert (MoE) Model Trained with 5.7T Tokens Using Muon Optimizer
MarkTechPost@AI
2025-02-23T04:50:15.000000Z