Trending
Articles related to "optimizer"
Muon Optimizer Significantly Accelerates Grokking in Transformers: Microsoft Researchers Explore Optimizer Influence on Delayed Generalization
MarkTechPost@AI 2025-04-23T06:10:36.000000Z
[ICLR 2025] Adam Gets a Mini Version: Memory Halved, Throughput Up by as Much as 50%
AI科技评论 2025-04-11T16:26:02.000000Z
The Evolution of Neural Network Optimizers: From SGD to RAD, Understanding the Core Craft of AI Training
智源社区 2025-03-04T07:33:21.000000Z
[Deep Learning] Crystal Clear! A Complete Summary of the Ten Core PyTorch Operations!!
机器学习初学者 2025-03-03T06:55:53.000000Z
Researchers from Moonshot AI Introduce Muon and Moonlight: Optimizing Large-Scale Language Models with Efficient Training Techniques
MarkTechPost@AI 2025-02-25T17:32:57.000000Z
9 Neural Network Optimization Algorithms Explained in Detail
掘金 人工智能 2025-02-07T06:17:34.000000Z
[Deep Learning] 50 Super Powerful PyTorch Operations!!
机器学习初学者 2024-12-15T06:32:35.000000Z
Level 1 and Level 2 Optimizers, or Tendimizers and Optimizers
少点错误 2024-11-13T03:07:03.000000Z
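A Hands-On Beginner's Guide to GPT. I am honored to be nominated as one of Xueqiu's top ten most influential users of 2024; here is the voting link, and thank you all for your support. DrChuck: [web link]. At the end of 2022, GPT3 burst onto the scene and changed the whole world. Unfortunately Op...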
雪球网今日 2024-11-07T08:02:59.000000Z
ArXiv 2024 | Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning
我爱计算机视觉 2024-10-16T13:43:07.000000Z
Can We Optimize Large Language Models Faster Than Adam? This AI Paper from Harvard Unveils SOAP to Improve and Stabilize Shampoo in Deep Learning
MarkTechPost@AI 2024-09-20T10:20:34.000000Z
This AI Paper from Apple Introduces AdEMAMix: A Novel Optimization Approach Leveraging Dual Exponential Moving Averages to Enhance Gradient Efficiency and Improve Large-Scale Model Training Performance
MarkTechPost@AI 2024-09-08T13:20:18.000000Z
The Real Deal on Language Model Optimizers: Performance and Practicality
MarkTechPost@AI 2024-07-16T06:31:30.000000Z
Adam-mini: A Memory-Efficient Optimizer Revolutionizing Large Language Model Training with Reduced Memory Usage and Enhanced Performance
MarkTechPost@AI 2024-07-02T14:16:42.000000Z
MIPRO: A Novel Optimizer that Outperforms Baselines on Five of Six Diverse Language Model LM Programs Using a Best-in-Class Open-Source Model (Llama-3-8B) by 12.9% accuracy
MarkTechPost@AI 2024-06-24T07:32:05.000000Z
Rethinking Neural Network Efficiency: Beyond Parameter Counting to Practical Data Fitting
MarkTechPost@AI 2024-06-22T18:01:39.000000Z