Trending
Articles related to "code reasoning"
The rewards are fake, but the 25% performance boost for Qwen is real!
智源社区 2025-05-30T07:58:19.000000Z
The rewards are fake, but the 25% performance boost for Qwen is real
36kr-科技 2025-05-30T02:43:11.000000Z
The rewards are fake, but the 25% performance boost for Qwen is real!
量子位 2025-05-29T11:43:12.000000Z
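LLM plus RL called into question: deliberately wrong rewards still bring significant gains on math benchmarks, and the AI community erupts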
掘金 人工智能 2025-05-28T09:28:04.000000Z
Reinforcement learning with random rewards actually works with Qwen 2.5
Interconnects 2025-05-27T16:50:21.000000Z
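o3 openly defies humans for the first time, humans have lost control! It rewrote its kill script to refuse shutdown, and the whole internet is alarmed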
智源社区 2025-05-27T06:53:03.000000Z
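o3 openly defies humans for the first time, humans have lost control! It rewrote its kill script to refuse shutdown, and the whole internet is alarmed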
新智元 2025-05-25T07:03:20.000000Z
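o3 model reportedly ignored human instructions and circumvented its shutdown script on its own; it also found a security vulnerability in the Linux kernel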
Cnbeta 2025-05-25T06:22:38.000000Z
Global AI giants have all started picking sides, but the one they picked is a Chinese-developed model...
机器学习初学者 2025-05-16T05:32:40.000000Z
AI That Teaches Itself: Tsinghua University’s ‘Absolute Zero’ Trains LLMs With Zero External Data
MarkTechPost@AI 2025-05-09T23:25:41.000000Z
Together AI Released DeepCoder-14B-Preview: A Fully Open-Source Code Reasoning Model That Rivals o3-Mini With Just 14B Parameters
MarkTechPost@AI 2025-04-11T06:55:34.000000Z
Chinese researchers at UC Berkeley open-source a 14B "o3-mini": a code-focused R1 makes a surprise assault on OpenAI's throne!
智源社区 2025-04-10T08:57:32.000000Z
Chinese researchers at UC Berkeley open-source a 14B "o3-mini": a code-focused R1 makes a surprise assault on OpenAI's throne
36氪 - 科技频道 2025-04-10T00:08:57.000000Z
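LLM reasoning surges, math and logic on overdrive! A new breakthrough from DeepSeek and other Chinese teams draws enthusiastic praise from a leading Ai2 researcher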
智源社区 2025-02-18T05:16:55.000000Z
Are LLMs Failing to Match with Suffix in Fill-in-the-Middle (FIM) Code Completion? Horizon-Length Prediction: A New AI Training Task to Advance FIM by Teaching LLMs to Plan Ahead over Arbitrarily Long Horizons
MarkTechPost@AI 2024-10-11T12:06:00.000000Z
Training recipe for a 70B large model, Part 1: dataset creation and evaluation
智源社区 2024-08-29T08:37:36.000000Z
Meet DeepSeek-Coder-V2 by DeepSeek AI: The First Open-Source AI Model to Surpass GPT4-Turbo in Coding and Math, Supporting 338 Languages and 128K Context Length
MarkTechPost@AI 2024-06-19T01:01:47.000000Z