All Content from Business Insider, July 20, 07:09
OpenAI just won gold at the world's most prestigious math competition. Here's why that's a big deal.

 

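OpenAI's newly unveiled experimental language model achieved a notable result on the International Math Olympiad (IMO) exam, solving five of the six problems and demonstrating AI's strength on complex mathematical problems. The breakthrough marks how quickly AI has advanced over the past decade, particularly in general-purpose reasoning and creative thinking. Although the model has not been publicly released, its performance suggests an important step toward general intelligence, and it has prompted discussion about AI's future usefulness and direction.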

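🌟 OpenAI's latest experimental language model solved five of the six problems on the demanding International Math Olympiad (IMO) exam, reaching gold medal-level performance, a major milestone for AI in general intelligence.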

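🧠 The model's IMO performance, especially the sustained creative thinking it displayed, is a marked improvement over past benchmarks and shows that AI has made substantial progress on problems that demand deep thought and reasoning.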

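🚀 OpenAI CEO Sam Altman called the result "a significant marker" of how far AI has come over the past decade, and stressed that the model is part of the company's push toward general intelligence rather than a system built specifically for math.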

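🤔 Despite the impressive result, AI critic Gary Marcus raised questions about how the model was trained, the scope of its general intelligence, and its usefulness to the general population, and noted that the IMO has not independently verified the results.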

OpenAI's latest model outperformed the brightest at the International Math Olympiad.

OpenAI's latest experimental model is a math whiz, performing so well on an insanely difficult math exam that everyone's now talking about it.

"I'm excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world's most prestigious math competition — the International Math Olympiad (IMO)," Alexander Wei, a member of OpenAI's technical staff, said on X.

The International Math Olympiad is a global competition that began in 1959 in Romania and is now considered one of the hardest in the world. It is held over two days; on each day, participants sit a four-and-a-half-hour exam with three questions. Some famous winners include Grigori Perelman, who helped advance geometry, and Terence Tao, recipient of the Fields Medal, the highest honor in mathematics.

In June, Tao predicted on Lex Fridman's podcast that AI would not score high on the IMO. He suggested researchers shoot a bit lower. "There are smaller competitions. There are competitions where the answer is a number rather than a long-form proof," he said.

Yet OpenAI's latest model solved five out of six of the problems correctly, working under the same testing conditions as humans, Wei said.

Wei's colleague, Noam Brown, said the model displayed a new level of endurance during the exam.

"IMO problems demand a new level of sustained creative thinking compared to past benchmarks," he said. "This model thinks for a long time."

Wei said the model is an upgrade in general intelligence. The model's performance is "breaking new ground in general-purpose reinforcement learning," he said. DeepMind's AlphaGeometry, by contrast, is designed specifically to do math.

"This is an LLM doing math and not a specific formal math system; it is part of our main push towards general intelligence," Altman said on X.

"When we first started openai, this was a dream but not one that felt very realistic to us; it is a significant marker of how far AI has come over the past decade," Altman wrote, referring to the model's performance at the IMO.

Altman added that a model with a "gold level of capability" will not be available to the public for "many months."

The achievement is an example of how fast the technology is developing. Just last year, "AI labs were using grade school math" to evaluate models, Brown said. And tech billionaire Peter Thiel said last year it would take at least another three years before AI could solve US Math Olympiad problems.

Still, there are always skeptics.

Gary Marcus, a well-known critic of AI hype, called the model's performance "genuinely impressive" on X. But he also posed several questions about how the model was trained, the scope of its "general intelligence," the utility for the general population, and the cost per problem. Marcus also said that the IMO has not independently verified these results.

Read the original article on Business Insider
