Trending
Articles related to "Grokking"
Muon Optimizer Significantly Accelerates Grokking in Transformers: Microsoft Researchers Explore Optimizer Influence on Delayed Generalization
MarkTechPost@AI 2025-04-23T06:10:36.000000Z
This AI Research from Ohio State University and CMU Discusses Implicit Reasoning in Transformers And Achieving Generalization Through Grokking
MarkTechPost@AI 2024-07-09T06:01:27.000000Z
Grokfast: Accelerating Grokking by Amplifying Slow Gradients
buzz 2024-06-04T16:33:14.000000Z