Researchers created an AI reasoning model on par with OpenAI's o1 for less than $50

Mashable, February 7

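Researchers at Stanford University and the University of Washington have developed a low-cost AI reasoning model that rivals OpenAI o1 and DeepSeek R1 in math and coding, at a compute cost of less than $50 in cloud computing credits. The model was trained on only 1,000 questions, in 26 minutes, using 16 Nvidia H100 GPUs. The researchers took outputs from Google's Gemini Thinking Experimental model, performed supervised fine-tuning on a pretrained model from Alibaba-owned Qwen, and improved performance by controlling compute time. This result, along with the emergence of other open-source reasoning models, signals that power in AI is shifting from a few giants to a much wider pool of developers.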

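💡 Researchers at Stanford University and the University of Washington have developed a low-cost AI reasoning model whose performance in math and coding is comparable to OpenAI o1 and DeepSeek R1, at a cost of less than $50 in cloud computing fees.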

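💰 The model was trained on only 1,000 carefully curated questions, in 26 minutes, using 16 Nvidia H100 GPUs, which suggests that optimizing training data and hardware configuration can sharply reduce the cost of developing AI models.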

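⏱️ The researchers improved performance by controlling the model's "thinking time." They created a token budget to limit the model's compute time and found that more thinking time produced more accurate results, which suggests that the efficiency of the reasoning process is critical to model performance.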

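🌐 Beyond the Stanford work, UC Berkeley researchers released the open-source reasoning model Sky-T1 at a cost of just $450. Microsoft Research Asia has also released the open-source rStar-Math reasoning model, and HuggingFace is actively working to replicate DeepSeek's R1 model. The emergence of these open-source projects is accelerating the spread of AI technology.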

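💪 Developers can build on top of existing AI models through APIs, open-source access, and even by distilling data from closed-source models, which cuts costs substantially. This lets more people take part in AI development and speeds up innovation and adoption.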

The floodgates have opened for building AI reasoning models on the cheap.

Researchers at Stanford and the University of Washington have developed a model that performs comparably to OpenAI o1 and DeepSeek R1 models in math and coding — for less than $50 of cloud compute credits.

What's more, the model was trained on only 1,000 questions, and took just 26 minutes and 16 Nvidia H100 GPUs. Stanford researcher Niklas Muennighoff said in an email to Mashable that the cost is an estimate based on the GPU runtime and number of H100 GPUs used.
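
As a rough illustration of how an estimate like that is put together, here is a minimal Python sketch. Only the GPU count (16) and runtime (26 minutes) come from the article. The hourly rental price is a hypothetical placeholder, since H100 prices vary by cloud provider, and the function names are made up for this example.

```python
def gpu_hours(num_gpus: int, minutes: float) -> float:
    """Total GPU-hours consumed by a run on num_gpus GPUs for the given minutes."""
    return num_gpus * minutes / 60.0


def estimated_cost(num_gpus: int, minutes: float, usd_per_gpu_hour: float) -> float:
    """Estimated rental cost in USD: GPU-hours times an hourly price per GPU."""
    return gpu_hours(num_gpus, minutes) * usd_per_gpu_hour


# Figures reported in the article.
NUM_GPUS = 16
RUNTIME_MINUTES = 26

# ASSUMPTION: illustrative rental price only, not a figure from the article.
HYPOTHETICAL_USD_PER_GPU_HOUR = 3.00

hours = gpu_hours(NUM_GPUS, RUNTIME_MINUTES)
cost = estimated_cost(NUM_GPUS, RUNTIME_MINUTES, HYPOTHETICAL_USD_PER_GPU_HOUR)
# Highest hourly price at which the run would still cost under $50.
break_even_rate = 50.0 / hours

print(f"GPU-hours used: {hours:.2f}")
print(f"Cost at the assumed rate: ${cost:.2f}")
print(f"Hourly rate that would hit $50: ${break_even_rate:.2f}")
```

The run works out to about 6.9 GPU-hours, so it stays under $50 at any rental price below roughly $7.21 per GPU-hour.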

The AI industry of late is all about how new approaches to the pre- and post-training process can massively save computing costs, as evidenced by DeepSeek's disruptive impact. On top of that, developers are now able to build on top of existing AI models at little or no cost, through APIs, open-source access, and even by distilling the outputs of closed-source models, bringing the costs down even more.

According to the team's research paper, which was published last Friday, s1 was trained on a dataset consisting of "1,000 carefully curated questions paired with reasoning traces and answers distilled from Gemini Thinking Experimental." Google's Gemini Thinking Experimental model is accessible with daily limits through AI Studio. While it's a closed-source model, that clearly hasn't stopped researchers from making use of its responses.

Next, the researchers used an "off the shelf" pretrained model from the Alibaba-owned lab Qwen and performed supervised fine-tuning on the curated dataset. Then, the team created a token budget to control the amount of compute time for testing the model. If s1 went over budget on thinking tokens, it was cut off and forced to generate whatever answer it came up with. If the researchers wanted the model to spend more "test-time compute" on a problem, they would simply tell the model to "wait," which extended its thinking time and led to more accurate results.
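
The budget mechanism described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the researchers' code. The end-of-thinking marker, the `think_with_budget` function, its `min_tokens` and `max_tokens` parameters, and the stand-in `toy_model` are all hypothetical names invented for this example. The article only says that thinking is cut off when it goes over budget and extended by telling the model to "wait."

```python
from typing import Callable, List

END_THINK = "<end_think>"  # hypothetical marker for "the model wants to stop thinking"
WAIT = "Wait"              # token appended to nudge the model to keep thinking


def think_with_budget(
    next_token: Callable[[str, List[str]], str],
    prompt: str,
    max_tokens: int,
    min_tokens: int = 0,
) -> List[str]:
    """Collect thinking tokens while enforcing a budget.

    - Once the trace holds max_tokens tokens, thinking is cut off.
    - If the model tries to stop before min_tokens, the stop is replaced
      with WAIT so the model continues reasoning.
    """
    trace: List[str] = []
    while len(trace) < max_tokens:
        token = next_token(prompt, trace)
        if token == END_THINK:
            if len(trace) >= min_tokens:
                break          # model stopped on its own, within budget
            token = WAIT       # too early: suppress the stop, ask for more
        trace.append(token)
    return trace


def toy_model(prompt: str, trace: List[str]) -> str:
    """Stand-in for a real model: emits three 'step' tokens, then tries to stop.

    A WAIT token in the trace restarts the count of steps.
    """
    since_wait = 0
    for token in reversed(trace):
        if token == WAIT:
            break
        since_wait += 1
    return END_THINK if since_wait >= 3 else "step"


print(think_with_budget(toy_model, "2+2?", max_tokens=10))
print(think_with_budget(toy_model, "2+2?", max_tokens=2))
print(think_with_budget(toy_model, "2+2?", max_tokens=10, min_tokens=5))
```

In a real system, `next_token` would call the fine-tuned model, and the final answer would be generated after the thinking trace ends. The sketch covers only the budget logic.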

By controlling the amount of time and compute spent on a problem, the researchers were able to show how increased thinking time leads to improved performance.

S1 is one example of open-source reasoning models that have been developed for a fraction of the cost of flagship models from Google and OpenAI. In January, UC Berkeley researchers released an open-source reasoning model called Sky-T1 that cost $450, "demonstrating that it is possible to replicate high-level reasoning capabilities affordably and efficiently," per its blog post. There's also the open-source rStar-Math reasoning model from Microsoft Asia researchers and Tulu 3 from nonprofit research institute Ai2, and HuggingFace has its own initiative to replicate DeepSeek's R1.

As high-quality models become more accessible and cheaper, we're starting to see a power shift from the few AI heavy hitters to the many.
