Articles related to "Grokking"
Muon Optimizer Significantly Accelerates Grokking in Transformers: Microsoft Researchers Explore Optimizer Influence on Delayed Generalization
MarkTechPost@AI
2025-04-23T06:10:36.000000Z
This AI Research from Ohio State University and CMU Discusses Implicit Reasoning in Transformers And Achieving Generalization Through Grokking
MarkTechPost@AI
2024-07-09T06:01:27.000000Z
Grokfast: Accelerating Grokking by Amplifying Slow Gradients
buzz
2024-06-04T16:33:14.000000Z