Articles related to "test-time scaling"
Can process reward models also scale at test time? Tsinghua and Shanghai AI Lab use 23K data samples to let a 1.5B small model overtake GPT-4o
机器之心
2025-04-14T08:36:03.000000Z
This AI Paper Introduces ‘Shortest Majority Vote’: An Improved Parallel Scaling Method for Enhancing Test-Time Performance in Large Language Models
MarkTechPost@AI
2025-02-21T04:50:10.000000Z
DeepSeek-R1 Now Live With NVIDIA NIM
Nvidia Blog
2025-02-16T15:07:08.000000Z
How Scaling Laws Drive Smarter, More Powerful AI
Nvidia Blog
2025-02-16T15:07:07.000000Z
Surpassing o1 with only 1K samples and rivaling DeepSeek-R1 distilled models: Fei-Fei Li's new work s1 is released
PaperAgent
2025-02-09T16:22:03.000000Z
s1: A Simple Yet Powerful Test-Time Scaling Approach for LLMs
MarkTechPost@AI
2025-02-06T16:51:09.000000Z
Nvidia drops $600B off its market cap amid the rise of DeepSeek
TechCrunch News
2025-01-27T22:35:57.000000Z
As R1 takes off, Tsinghua and HKUST release a new comprehensive survey of reinforced reasoning techniques for large models
PaperAgent
2025-01-25T17:18:49.000000Z
This AI Paper Explores Reinforced Learning and Process Reward Models: Advancing LLM Reasoning with Scalable Data and Test-Time Scaling
MarkTechPost@AI
2025-01-19T19:34:57.000000Z
What does o3 mean? "Scaling laws" continue in 2025, with costs higher and harder to control
华尔街见闻
2024-12-24T08:18:35.000000Z
OpenAI’s o3 suggests AI models are scaling in new ways — but so are the costs
TechCrunch News
2024-12-24T00:22:10.000000Z