TechCrunch News · February 6
Researchers created an open rival to OpenAI’s o1 ‘reasoning’ model for under $50

Researchers at Stanford and the University of Washington trained an AI "reasoning" model, s1, for under $50 in cloud compute costs. The model performs comparably to leading models such as OpenAI's o1 and DeepSeek's r1 on tests of math and coding ability. The s1 model, along with its training data and code, has been open-sourced on GitHub. The team created s1 via knowledge distillation, extracting "reasoning" capability from Google's Gemini 2.0 Flash Thinking Experimental. The result has sparked discussion about the commoditization of AI models and the challenge it poses to the dominant position of large AI labs. The research also found that a reasoning model's capabilities can be reproduced relatively cheaply using supervised fine-tuning (SFT) and a small dataset.

💰 Low-cost breakthrough: the Stanford team trained s1, an AI reasoning model whose performance is close to leading models from OpenAI and DeepSeek, for under $50 in cloud compute, demonstrating the potential of low-cost AI innovation.

🧠 Knowledge distillation: s1's success comes from knowledge distillation, extracting "reasoning" capability from Google's Gemini 2.0 Flash Thinking Experimental to achieve fast training and strong performance.

📚 Supervised fine-tuning (SFT): the research shows that a small dataset combined with supervised fine-tuning can effectively replicate a model's reasoning capabilities, lowering the cost of and barrier to training AI models.

🤔 The "wait" trick: the researchers found that inserting a "wait" instruction during s1's reasoning prompts it to double-check its work, improving the accuracy of its answers.

AI researchers at Stanford and the University of Washington were able to train an AI “reasoning” model for under $50 in cloud compute credits, according to a new research paper released last Friday.

The model, known as s1, performs similarly to cutting-edge reasoning models, such as OpenAI’s o1 and DeepSeek’s r1, on tests measuring math and coding abilities. The s1 model is available on GitHub, along with the data and code used to train it.

The team behind s1 said they created the AI model through distillation, a process to extract the “reasoning” capabilities from another AI model by training on its answers. The researchers said s1 is distilled from one of Google’s reasoning models, Gemini 2.0 Flash Thinking Experimental. Distillation is the same approach Berkeley researchers used to create an AI reasoning model for around $450 last month.

To some, the idea that a few researchers without millions of dollars behind them can still innovate in the AI space is exciting. But s1 raises real questions about the commoditization of AI models. Where’s the moat if someone can closely replicate a multi-million dollar model with relative pocket change?

Unsurprisingly, big AI labs aren’t happy. OpenAI has accused DeepSeek of improperly harvesting data from its API for the purposes of model distillation.

The researchers behind s1 were looking to find the simplest approach to achieve strong reasoning performance and “test-time scaling,” or allowing an AI model to think more before it answers a question. These were a few of the breakthroughs in OpenAI’s o1, which DeepSeek and other AI labs have tried to replicate through various techniques.

The s1 paper suggests that reasoning models can be distilled with a relatively small dataset using a process called supervised fine-tuning (SFT), in which an AI model is explicitly instructed to mimic certain behaviors in a dataset. SFT tends to be cheaper than the large-scale reinforcement learning method that DeepSeek employed to train its answer to OpenAI’s o1, R1.
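As a rough illustration of what supervised fine-tuning on distilled outputs involves, the sketch below trains a causal language model to reproduce demonstration texts verbatim; the model name and demonstration strings are placeholder assumptions, not the s1 team's exact setup.

```python
# Minimal supervised fine-tuning (SFT) sketch: the model is trained to
# reproduce demonstration texts, i.e. to mimic the behavior shown in the data.
# The model name and demonstrations below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # assumed small off-the-shelf base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

demonstrations = [
    "Question: What is 17 * 24?\n"
    "<think>17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.</think>\n"
    "Answer: 408",
    # ... in s1's case, roughly 1,000 curated examples with teacher reasoning traces
]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for epoch in range(3):
    for text in demonstrations:
        batch = tokenizer(text, return_tensors="pt", truncation=True)
        # Standard causal-LM loss: labels are the input tokens themselves,
        # so the model learns to imitate the full question/reasoning/answer trace.
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```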

Google offers free access to Gemini 2.0 Flash Thinking Experimental, albeit with daily rate limits, via its Google AI Studio platform. Its terms forbid reverse-engineering its models to develop services that compete with Google’s own AI offerings, however. We’ve reached out to Google for comment.

S1 is based on a small, off-the-shelf AI model from Alibaba-owned Chinese AI lab Qwen, which is available to download for free. To train s1, the researchers created a dataset of just 1,000 carefully curated questions, paired with answers to those questions as well as the “thinking” process behind each answer from Google’s Gemini 2.0 Flash Thinking Experimental.
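One way to picture such a training example is a question paired with the teacher's thinking trace and final answer; the record below is a hypothetical illustration of that structure, not an actual entry from the s1 dataset.

```python
# Hypothetical shape of one distilled training record (not from the actual s1 data):
record = {
    "question": "How many positive divisors does 360 have?",
    "thinking": "360 = 2^3 * 3^2 * 5, so the divisor count is (3+1)(2+1)(1+1) = 24.",
    "answer": "24",
}

# For fine-tuning, each record would be flattened into a single training string:
# the question, then the thinking trace between delimiters, then the answer.
text = (
    f"Question: {record['question']}\n"
    f"<think>{record['thinking']}</think>\n"
    f"Answer: {record['answer']}"
)
```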

Training s1 took less than 30 minutes using 16 Nvidia H100 GPUs, and the resulting model achieved strong performance on certain AI benchmarks, according to the researchers. Niklas Muennighoff, a Stanford researcher who worked on the project, told TechCrunch he could rent the necessary compute today for about $20.

The researchers used a nifty trick to get s1 to double-check its work and extend its “thinking” time: they told it to wait. Adding the word “wait” during s1’s reasoning helped the model arrive at slightly more accurate answers, per the paper.
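A rough sketch of how that intervention might look in code: when the model would otherwise stop reasoning, append "Wait" and let it generate further before producing the final answer. The model name, delimiters, and prompt format here are assumptions for illustration, not the paper's exact implementation.

```python
# Sketch of the "wait" trick: when the model finishes its reasoning, append
# "Wait" and resume generation so it re-examines its work before answering.
# Model name, delimiters, and prompt format are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # stand-in for the fine-tuned s1 model
tok = AutoTokenizer.from_pretrained(model_name)
lm = AutoModelForCausalLM.from_pretrained(model_name)

def continue_text(prompt: str, max_new_tokens: int = 256) -> str:
    # Greedily continue the prompt and return only the newly generated tokens.
    ids = tok(prompt, return_tensors="pt")
    out = lm.generate(**ids, max_new_tokens=max_new_tokens, do_sample=False)
    return tok.decode(out[0][ids["input_ids"].shape[1]:], skip_special_tokens=True)

def answer_with_wait(question: str, extra_passes: int = 1) -> str:
    prompt = f"Question: {question}\n<think>"
    trace = continue_text(prompt)
    for _ in range(extra_passes):
        # Instead of letting the reasoning end here, add "Wait" and keep generating,
        # nudging the model to re-check its previous steps.
        trace = trace.rstrip() + "\nWait,"
        trace += continue_text(prompt + trace)
    # Close the thinking block and ask for the final answer.
    return continue_text(prompt + trace + "</think>\nAnswer:", max_new_tokens=32)

print(answer_with_wait("What is 17 * 24?"))
```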

In 2025, Meta, Google, and Microsoft plan to invest hundreds of billions of dollars in AI infrastructure, which will partially go toward training next-generation AI models. That level of investment may still be necessary to push the envelope of AI innovation. Distillation has proven to be a good method for cheaply recreating an AI model’s capabilities, but it doesn’t create new AI models vastly better than what’s available today.
