Trending
Articles related to "language model training"
Apple Researchers Propose Cut Cross-Entropy (CCE): A Machine Learning Method that Computes the Cross-Entropy Loss without Materializing the Logits for all Tokens into Global Memory
MarkTechPost@AI 2024-11-15T20:05:08.000000Z
This Machine Learning Research Discusses How Task Diversity Shortens the In-Context Learning (ICL) Plateau
MarkTechPost@AI 2024-10-21T04:36:08.000000Z
Reaching a ChatGPT Moment in Robotics Requires Large Models Plus Reinforcement Learning | Invited Talk by Star Professor Sergey
BAAI Community (智源社区) 2024-09-03T05:07:38.000000Z
How Important is the Reference Model in Direct Preference Optimization (DPO)? An Empirical Study on Optimal KL-Divergence Constraints and Necessity
MarkTechPost@AI 2024-08-01T06:34:34.000000Z