Articles related to "KL divergence"
Off-Policy Reinforcement Learning RL with KL Divergence Yields Superior Reasoning in Large Language Models
MarkTechPost@AI
2025-06-02T04:56:04.000000Z
$500 + $500 Bounty Problem: An (Approximately) Deterministic Maximal Redund Always Exists
LessWrong
2025-05-06T23:07:26.000000Z
[Read Along] A Guide to the "Flower Book" "Deep Learning", Chapter 3: Probability and Information Theory (Part 2)
Hupu - Hot Posts
2024-11-24T19:35:16.000000Z
How to Evaluate the Effects of Large Model Quantization Accurately and Interpretably?
BAAI Community
2024-08-10T08:07:28.000000Z
Beyond Accuracy: Evaluating LLM Compression with Distance Metrics
MarkTechPost@AI
2024-07-18T11:03:46.000000Z