LessWrong, February 4
Forecasting AGI: Insights from Prediction Markets and Metaculus

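This article analyzes prediction market and Metaculus questions about artificial general intelligence (AGI) timelines and compares the forecasts and resolution criteria across platforms. Metaculus predicts that AGI may arrive around mid-2030, but its criteria include robotics capabilities, which do not match some definitions of AGI. Manifold gives probabilities of AGI arriving within the next few years and puts the probability of arrival before 2028 at close to 50%. The probability that OpenAI announces AGI this year is estimated at 24% to 27%. The author thinks AGI may arrive between mid-2026 and the end of 2027, and emphasizes both the high uncertainty of these forecasts and the importance of international coordination.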

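📅 Metaculus predicts that AGI will arrive around mid-2030, but its resolution criteria include robotics capabilities. This is a fairly strict definition, and some people think AGI does not need robotics capabilities.

💡 Manifold markets give the probability of AGI arriving within the next few years: 14% before 2026, 47% before 2028, and 57% before 2029.

📢 Polymarket and Kalshi put the probability of OpenAI announcing AGI this year at 27% and 24% respectively, but the author thinks this may overestimate the chance of AGI actually arriving, because OpenAI might announce early as a market strategy.

🤖 AI currently still falls short on robotics capabilities and on accuracy on the APPS benchmark, but coding capabilities are improving quickly, which suggests that robotic skills may be the limiting factor for AGI.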

Published on February 4, 2025 1:03 PM GMT

I have tried to find all prediction market and Metaculus questions related to AGI timelines. Here I examine how they compare to each other, and what they actually say about when AGI might arrive.

If you know of a market that I have missed, please tell me in the comment section! It would also be helpful if you told me which questions you think are relevant but missing from this analysis. This is a linkpost, and I would prefer that you comment on the original post on my new blog, Forecasting AI Futures, but feel free to comment here as well. Subscribe to the blog for updates on my future forecasting posts related to AI safety.

Whenever possible, please check the more recent probability estimates in the embedded sites, instead of looking at my At The Time Of Writing (ATTOW) numbers.

So, what do prediction markets and Metaculus have to say about AGI?

Metaculus has this question for the arrival date of AGI:

The AI system needs to be able to:

Metaculus thinks this will probably occur around the middle of 2030, though with high uncertainty. The interval between the lower and upper quartiles for the individual predictions on this question is (2026-12-28 - 2039-03-27) ATTOW.
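To get a feel for how wide that interquartile interval is, here is a minimal sketch that measures it. It uses only the two quartile dates quoted above; the helper name `interval_width_years` and the 365.25-day average year are my own illustrative assumptions, not anything Metaculus publishes.

```python
from datetime import date

# Quartile dates quoted in the text (ATTOW).
lower_quartile = date(2026, 12, 28)
upper_quartile = date(2039, 3, 27)


def interval_width_years(start: date, end: date) -> float:
    """Width of a date interval in years, assuming an average year of 365.25 days."""
    return (end - start).days / 365.25


iqr_days = (upper_quartile - lower_quartile).days
iqr_years = interval_width_years(lower_quartile, upper_quartile)
print(f"Interquartile range: {iqr_days} days (about {iqr_years:.1f} years)")
```

The middle half of the individual predictions spans a bit more than twelve years, which is what "high uncertainty" means concretely here.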

GPT-4o achieves an accuracy of 88.7% on MMLU, as seen in the leaderboard here. GPT-4 was used to get 22% accuracy on APPS. Unfortunately, most of the best models have not been tested on either MMLU or APPS.

OpenAI’s o3 has been reported to achieve 71.7% on SWE-bench Verified. We can compare that to GPT-4, which achieved 22.4% on SWE-bench Verified and 22% accuracy on APPS. Based on this, I think o3 would achieve above 50% accuracy on APPS.

The two criteria that AI currently seems furthest from fulfilling are the robotics capabilities and the APPS accuracy, though the current best performance on the APPS benchmark is uncertain. Coding capabilities are improving very fast, as indicated by the rapid improvements in accuracy on SWE-bench Verified, while robotics capabilities are lagging behind. If there are not too many errors in the APPS benchmark dataset, robotic skills are probably the limiting factor.

But even though this is an interesting thing to forecast, the first AI system fulfilling these criteria is not necessarily a true AGI. Like the current SOTA general-purpose AIs, which are getting really good at things like answering graduate-level questions (o3 achieves 87.7% accuracy on GPQA) and coding, such a system might not be agentic enough to, for example, replace all remote workers.

On the other hand, an AI system that can perform all purely intellectual tasks that a human can do might be developed before the robotics criterion is fulfilled. Depending on what definition of AGI you prefer, AGI might arrive before the resolution of the question above. These considerations imply that the question only provides a rough proxy for estimating actual AGI arrival time.

Manifold is of course also attempting to forecast AGI. A user called RemNi helpfully posted a question for the probability that AGI arrives before each of the coming years. The resolution criteria are relatively subjective but seem good enough to me: “AGI can theoretically perform any intellectual task that a human being can. It involves the capability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly, and learn from experience.” Here are the links to the markets for the next 8 years, and their respective predictions ATTOW:

We can also follow how the probability has changed over time:

The difference between the Manifold and Metaculus predictions could be due to the different resolution criteria. Even if the Manifold version is a bit more subjective, to me it seems closer to an actual definition of AGI than the Metaculus question. I don’t think robotics capabilities are necessary for an AI system to be transformative, which an AGI would arguably be.

There are also markets for how probable it is that OpenAI announces AGI this year. Polymarket and Kalshi estimate 27% probability and 24% probability respectively for this ATTOW.

Of course, an announcement of AGI is not equivalent to AGI arrival. It seems that OpenAI and Microsoft have made a deal under which Microsoft will lose access to OpenAI technology once AGI is achieved, and they have agreed on a monetary definition of AGI as an “AI system that can generate at least $100 billion in profits”. AGI could plausibly arrive before then, since AGI could be really expensive to run. But the resolution criteria are “OpenAI or an official representative of the company announces that it has created an artificial general intelligence (AGI)”, and OpenAI could make such an announcement to the public before the monetary definition is met. It wouldn’t really surprise me if they announced AGI as a market strategy to hype their technology, so even if this market is correctly priced according to its criteria, I think it is likely to overestimate how likely OpenAI is to actually achieve AGI this year. And since OpenAI seems to be at the frontier, it probably overestimates the probability of AGI this year in general.

Conclusion

To summarize some of the most important details:

Manifold estimates a 14% probability of AGI this year, a bit lower than Kalshi’s 24% probability and Polymarket’s 27% probability for OpenAI announcing AGI this year.

Metaculus expects AGI around the middle of 2030 but requires robotics capabilities, which some would argue are not necessary for AGI.

Manifold estimates a 47% probability of AGI before 2028. It also estimates a 43% probability that AGI arrives after 2025 and before 2029:
57% (before 2029) - 14% (before 2026) = 43%.
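The subtraction above works for any pair of "before year X" probabilities. Here is a minimal sketch, assuming the separate Manifold markets can be read as points on one consistent cumulative distribution (they are separate markets, so this is only approximately true). The function name `interval_probability` is hypothetical, and the numbers are the ATTOW figures quoted in this post.

```python
from typing import Mapping

# Manifold cumulative probabilities quoted in this post (ATTOW):
# P(AGI arrives before the start of the given year).
p_before = {2026: 0.14, 2028: 0.47, 2029: 0.57}


def interval_probability(cdf: Mapping[int, float], start_year: int, end_year: int) -> float:
    """P(AGI arrives in start_year or later, but before end_year)."""
    if start_year >= end_year:
        raise ValueError("start_year must be earlier than end_year")
    return cdf[end_year] - cdf[start_year]


# After 2025 and before 2029, i.e. during 2026, 2027, or 2028.
p_2026_through_2028 = interval_probability(p_before, 2026, 2029)
print(f"P(AGI after 2025 and before 2029) = {p_2026_through_2028:.0%}")
```

The same call with `(2026, 2028)` gives the probability for 2026 and 2027 only, which is the window my own best guess falls in.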

What should you make of this? First, although several sources predict that AGI is likely to arrive within the next 10 years, there is high uncertainty. The predictions differ between Metaculus and Manifold, as do their respective resolution criteria. Manifold estimates about an even chance of AGI arriving before 2028, while Metaculus estimates an even chance of AGI arriving before mid-2030.

What do I think myself?

My best guess is that AGI arrives somewhere between mid-2026 and the end of 2027, with above 50% probability. This approximately matches Manifold’s predictions, but with a lower probability of AGI this year.

Also, the Manifold predictions can provide an estimate of the cumulative distribution function for AGI arrival. I think they get the shape of the function right, with the probability of AGI rising fast in the beginning and rising more slowly after 2029. If AGI takes that long, it might mean that it was harder than expected to achieve. Additionally, humanity will have had more time to coordinate around regulation and international treaties, perhaps even pulling off an international pause, which would lengthen AGI timelines.
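As a rough illustration of treating the Manifold numbers as a cumulative distribution function, the sketch below linearly interpolates between the three figures quoted in this post (14% before 2026, 47% before 2028, 57% before 2029). Linear interpolation and the helper names are my own assumptions for illustration; the markets only give probabilities at year boundaries, and the sketch says nothing about years outside the quoted range.

```python
# (year boundary, P(AGI before that boundary)), Manifold figures ATTOW.
points = [(2026.0, 0.14), (2028.0, 0.47), (2029.0, 0.57)]


def year_at_probability(points, target):
    """Linearly interpolate the year at which the cumulative probability reaches target."""
    for (y0, p0), (y1, p1) in zip(points, points[1:]):
        if p0 <= target <= p1:
            return y0 + (target - p0) / (p1 - p0) * (y1 - y0)
    raise ValueError("target lies outside the quoted probabilities")


def yearly_increase(points):
    """Average probability gained per year on each segment between quoted points."""
    return [(p1 - p0) / (y1 - y0) for (y0, p0), (y1, p1) in zip(points, points[1:])]


median_year = year_at_probability(points, 0.50)
rates = yearly_increase(points)
print(f"Interpolated median arrival: {median_year:.1f}")
print("Probability gained per year:", [f"{r:.1%}" for r in rates])
```

Under these assumptions the interpolated median lands shortly after the start of 2028, and the probability gained per year is higher on the 2026 to 2028 segment than on the 2028 to 2029 segment, consistent with a curve that flattens over time.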

I will write more about international coordination in a separate post; prediction markets have many interesting things to say about that as well.


