Mashable, July 22, 02:00
OpenAI claims gold medal performance at prestigious math competition, drama ensues

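An unreleased reasoning model from OpenAI achieved a gold-medal score at the International Mathematical Olympiad (IMO), sparking heated discussion in the competitive math world. The model solved five of the six problems for a total of 35 points out of 42, which is gold-medal level. Researcher Alexander Wei shared the news on X. However, the timing of the announcement drew criticism because it may have overshadowed the human competitors' results. The IMO reportedly asked the AI labs working with the organization to delay their announcements by a week to avoid stealing the students' thunder. OpenAI said it had not formally cooperated with the IMO and had instead verified its scores through independent mathematicians, so it was not bound by any agreement. Google DeepMind also announced that an advanced version of its Gemini model reached the gold-medal standard at the IMO, officially graded and certified by IMO coordinators.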

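🤖 OpenAI's unreleased reasoning model performed strongly at the International Mathematical Olympiad (IMO), earning a gold-medal-level score and showing how capable AI has become at solving complex math problems. The model received full marks on five problems for a total of 35 points, demonstrating its ability to reason through and prove algebra and pre-calculus problems.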

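🏅 The AI's performance at the IMO was striking, but the timing of the announcement stirred controversy. The IMO reportedly wanted AI labs to delay their announcements so as not to overshadow the human contestants' results, but OpenAI said its results were independently verified and that it had no agreement with the IMO.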

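🤝 Meanwhile, Google DeepMind announced that its Gemini model also reached the gold-medal standard at the IMO, with official certification from IMO coordinators. This shows that multiple AI companies are actively exploring and showcasing their models' abilities in advanced math competitions, and it points to broader use of AI in scientific research.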

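⚖️ Although OpenAI maintains that its announcement broke no rules, criticism of its conduct as "rude" and "inappropriate" continues. The dispute reflects how hard it is to balance showcasing technical progress with fairness and respect for a traditional competition when AI results are announced.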

OpenAI announced its unreleased reasoning model won the gold at the International Mathematical Olympiad (IMO), igniting fierce drama in the world of competitive math.

While most high schoolers blissfully enjoy a break from school and homework, top math students from around the world brought their A-game to the IMO, considered the most prestigious math competition. AI labs also competed with their LLMs, and an unreleased model from OpenAI achieved a high enough score to earn a gold medal, according to researcher Alexander Wei, who shared the news on X.

The OpenAI model got five out of the six problems correct, earning a gold medal-worthy score of 35 out of 42 points. "For each problem, three former IMO medalists independently graded the model’s submitted proof, with scores finalized after unanimous consensus," according to Wei. The problems are algebra and pre-calculus challenges that require creative thinking on the competitor's part. For an LLM to reason its way through long, complex proofs is therefore an impressive achievement.
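For readers who want to check the arithmetic behind "35 out of 42," here is a minimal Python sketch. It assumes the standard IMO convention that each of the six problems is graded from 0 to 7 points, which the article does not state. The per-problem breakdown and the `total_score` helper are hypothetical, chosen only to match "five out of six problems correct."

```python
# Assumption (not stated in the article): each IMO problem is graded 0-7 points.
POINTS_PER_PROBLEM = 7
NUM_PROBLEMS = 6
MAX_SCORE = POINTS_PER_PROBLEM * NUM_PROBLEMS  # 6 * 7 = 42


def total_score(per_problem_scores):
    """Return the total score after checking each entry is a valid 0-7 mark."""
    if len(per_problem_scores) != NUM_PROBLEMS:
        raise ValueError(f"expected {NUM_PROBLEMS} scores")
    for score in per_problem_scores:
        if not 0 <= score <= POINTS_PER_PROBLEM:
            raise ValueError(f"score {score} is outside 0..{POINTS_PER_PROBLEM}")
    return sum(per_problem_scores)


# Hypothetical breakdown: five fully solved problems and one with no credit.
scores = [7, 7, 7, 7, 7, 0]
print(f"{total_score(scores)} out of {MAX_SCORE}")
```

Under that assumption, five complete solutions and one problem with no credit account for the reported total.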

However, the timing of the announcement is being criticized for overshadowing the human competitors' results. The IMO reportedly asked the AI labs that were officially working with the organization to verify their results to wait a week before making any announcements, to avoid stealing the kids' thunder. That's according to an X post from Mikhail Samin, who runs the AI Governance and Safety Institute nonprofit. OpenAI said it didn't formally cooperate with the IMO to verify its results and instead worked with individual mathematicians to independently verify its scores, so it wasn't beholden to any such agreement. Mashable sent a direct message to Samin on X for comment.

But the gossip is that this rubbed organizers the wrong way; they reportedly thought it was "rude" and "inappropriate" for OpenAI to do this. This is all hearsay, based on rumors from Samin, who also posted a screenshot of a similar comment from someone named Joseph Myers, presumably the two-time IMO gold medalist. Mashable contacted Myers for comment, but he has not publicly confirmed the authenticity of the screenshot.

In response, OpenAI researcher Noam Brown said the company posted the results after the IMO closing ceremony, honoring an IMO organizer's request. Brown also said OpenAI wasn't in touch with the IMO, suggesting it made no agreement to announce the results later.

Meanwhile, Google DeepMind reportedly did cooperate with the IMO, and announced this afternoon that an "advanced version of Gemini with Deep Think officially achieve[d] gold-medal standard at the International Mathematical Olympiad." According to the announcement, DeepMind's model was "officially graded and certified by IMO coordinators using the same criteria as for student solutions." Read into that statement as much or as little as you want, but the timing is hardly coincidental.

Others may follow the Real Housewives, but the proper decorum of elite math competitions is the high drama we live for.


Disclosure: Ziff Davis, Mashable’s parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.
