TechCrunch News, November 21, 2024
A Chinese lab has released a ‘reasoning’ AI model to rival OpenAI’s o1

 

DeepSeek has released a reasoning AI model called DeepSeek-R1 whose reasoning capabilities are comparable to OpenAI's o1. DeepSeek-R1 fact-checks itself by spending more time thinking and planning, which helps it avoid some of the pitfalls of conventional AI models. Like o1, it reasons and plans through tasks, taking a series of actions to arrive at an answer, a process that can take tens of seconds. DeepSeek-R1 performs well on two AI benchmarks, but it also has shortcomings: it struggles with some logic problems and sidesteps sensitive topics. The latter is likely a result of regulatory pressure from the Chinese government, which requires AI models to embody core socialist values and restricts the sources of their training data. As the validity of "scaling laws" comes under question, research into reasoning models is drawing growing attention, and DeepSeek-R1 represents a new direction for the field.

🤔 DeepSeek has released DeepSeek-R1, a reasoning AI model whose reasoning ability is comparable to OpenAI's o1; it fact-checks itself by spending more time thinking and planning, avoiding some pitfalls of conventional AI models.

📊 DeepSeek-R1 performs strongly on two AI benchmarks, AIME and MATH, on par with OpenAI's o1-preview model, but it is not flawless: it struggles with some logic problems.

⚠️ DeepSeek-R1 appears to block questions deemed too politically sensitive, such as those about Chinese leader Xi Jinping, Tiananmen Square, and the geopolitical implications of China invading Taiwan, likely a result of regulatory pressure from the Chinese government.

💡 As the validity of "scaling laws" comes under question, research into reasoning models is drawing growing attention; DeepSeek-R1 represents a new direction for AI: test-time compute.

💰 DeepSeek is backed by High-Flyer Capital Management, a quantitative hedge fund that uses AI to inform its trading decisions and commands substantial compute resources, including a server cluster with 10,000 Nvidia A100 GPUs.

A Chinese lab has unveiled what appears to be one of the first “reasoning” AI models to rival OpenAI’s o1.

On Wednesday, DeepSeek, an AI research company funded by quantitative traders, released a preview of DeepSeek-R1, which the firm claims is a reasoning model competitive with o1.

Unlike most models, reasoning models effectively fact-check themselves by spending more time considering a question or query. This helps them avoid some of the pitfalls that normally trip up models.

Similar to o1, DeepSeek-R1 reasons through tasks, planning ahead and performing a series of actions that help the model arrive at an answer. This can take a while. Like o1, depending on the complexity of the question, DeepSeek-R1 might “think” for tens of seconds before answering.


DeepSeek claims that DeepSeek-R1 (or DeepSeek-R1-Lite-Preview, to be precise) performs on par with OpenAI's o1-preview model on two popular AI benchmarks, AIME and MATH. AIME uses other AI models to evaluate a model's performance, while MATH is a collection of word problems. But the model isn't perfect. Some commentators on X noted that DeepSeek-R1 struggles with tic-tac-toe and other logic problems. (o1 does, too.)

DeepSeek-R1 also appears to block queries deemed too politically sensitive. In our testing, the model refused to answer questions about Chinese leader Xi Jinping, Tiananmen Square, and the geopolitical implications of China invading Taiwan.


The behavior is likely the result of pressure from the Chinese government on AI projects in the region. Models in China must undergo benchmarking by China’s internet regulator to ensure their responses “embody core socialist values.” Reportedly, the government has gone so far as to propose a blacklist of sources that can’t be used to train models — the result being that many Chinese AI systems decline to respond to topics that might raise the ire of regulators.

The increased attention on reasoning models comes as the viability of "scaling laws," long-held theories that throwing more data and computing power at a model would continuously increase its capabilities, comes under scrutiny. A flurry of press reports suggests that models from major AI labs including OpenAI, Google, and Anthropic aren't improving as dramatically as they once did.

That’s led to a scramble for new AI approaches, architectures, and development techniques. One is test-time compute, which underpins models like o1 and DeepSeek-R1. Also known as inference compute, test-time compute essentially gives models extra processing time to complete tasks.
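Neither OpenAI nor DeepSeek discloses exactly how its model spends that extra processing time, but one simple, well-known form of test-time compute is self-consistency voting: sample several independent answers and keep the most common one. The toy sketch below illustrates the idea only; the `fake_model_samples` stub and its canned answers are invented stand-ins for real model calls, not anyone's actual method.

```python
from collections import Counter

def fake_model_samples(question: str, n: int) -> list[str]:
    """Stub standing in for n stochastic calls to a language model.

    A real reasoning model would generate a chain of thought per
    sample; here we hard-code plausible candidates for illustration.
    """
    canned = ["4", "4", "3", "4", "5", "4", "4", "4"]
    return canned[:n]

def majority_vote_answer(question: str, n_samples: int = 8) -> str:
    """Spend extra inference-time compute by drawing many candidate
    answers and returning the majority vote (self-consistency)."""
    votes = Counter(fake_model_samples(question, n_samples))
    return votes.most_common(1)[0][0]

print(majority_vote_answer("What is 2 + 2?"))  # → 4
```

The trade-off is exactly the one the article describes: each additional sample costs more inference time, which is why models like DeepSeek-R1 can "think" for tens of seconds before answering.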

“We are seeing the emergence of a new scaling law,” Microsoft CEO Satya Nadella said this week during a keynote at Microsoft’s Ignite conference, referencing test-time compute.

DeepSeek, which says that it plans to open source DeepSeek-R1 and release an API, is a curious operation. It’s backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.

High-Flyer builds its own server clusters for model training, the most recent of which reportedly has 10,000 Nvidia A100 GPUs and cost 1 billion yuan (~$138 million). Founded by Liang Wenfeng, a computer science graduate, High-Flyer aims to achieve "superintelligent" AI through its DeepSeek org.
