LessWrong, February 4
Forecasting AGI: Insights from Prediction Markets and Metaculus

 

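This article compares forecasts of artificial general intelligence (AGI) timelines from prediction markets and Metaculus. Metaculus forecasts that AGI may arrive around mid-2030, but its criteria include robotics capabilities, which some consider unnecessary. Manifold puts the probability of AGI before 2028 at close to 50%. Prediction markets also track the probability that OpenAI announces AGI, although this is not the same as AGI actually arriving. The article also discusses how differing resolution criteria affect the forecasts and notes that AGI timelines remain highly uncertain. The author's personal view is that AGI may arrive between mid-2026 and the end of 2027.

📅 Metaculus forecasts that AGI will arrive around mid-2030, but requires the system to pass a Turing test, demonstrate robotics capabilities (assembling a model), and reach set accuracy levels on the MMLU and APPS benchmarks. AI currently still falls short on robotics capabilities and APPS accuracy.

🤔 Manifold markets forecast the probability of AGI over the next few years, for example 14% before 2026, 34% before 2027, and 47% before 2028. Manifold's criteria focus on intellectual tasks and do not treat robotics capabilities as necessary.

📣 Polymarket and Kalshi put the probability that OpenAI announces AGI this year at 27% and 24%, respectively. Note, however, that an OpenAI announcement of AGI is not the same as AGI actually being achieved, and it could be influenced by factors such as market strategy.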

Published on February 4, 2025 12:55 PM GMT

I have tried to find all prediction market and Metaculus questions related to AGI timelines. Here I examine how they compare to each other, and what they actually say about when AGI might arrive.

If you know of a market that I have missed, please tell me in the comment section! It would also be helpful if you told me which questions you think are relevant but missing from this analysis. This is a linkpost, and I would prefer that you comment on the original post on my new blog, Forecasting AI Futures, but feel free to comment here as well. Subscribe to the blog for updates on my future forecasting posts related to AI safety.

Whenever possible, please check the more recent probability estimates in the embedded sites, instead of looking at my At The Time Of Writing (ATTOW) numbers.

So, what do prediction markets and Metaculus have to say about AGI?

Metaculus has this question for the arrival date of AGI:

The AI system needs to be able to:

Metaculus thinks this will probably occur around the middle of 2030, though with high uncertainty. The interval between the lower and upper quartiles of the individual predictions on this question runs from 2026-12-28 to 2039-03-27 ATTOW.

GPT-4o achieves an accuracy of 88.7% on MMLU, as seen in the leaderboard here. GPT-4 was used to get 22% accuracy on APPS. Unfortunately, most of the best models have not been tested on either MMLU or APPS.

OpenAI’s o3 has been reported to achieve 71.7% on SWE-bench Verified. We can compare that to GPT-4, which achieved 22.4% on SWE-bench Verified and 22% accuracy on APPS. Based on this, I think o3 would achieve above 50% accuracy on APPS.

The two criteria that AI currently seems furthest from fulfilling are the robotics capabilities and APPS accuracy, though the current best performance on the APPS benchmark is uncertain. Coding capabilities are improving very fast, as indicated by the rapid improvements in accuracy on SWE-bench Verified, while robotics capabilities are lagging behind. If there are not too many errors in the APPS benchmark dataset, robotic skills are probably the limiting factor.

But even though this is an interesting thing to forecast, the first AI system fulfilling these criteria is not necessarily a true AGI. It could resemble the current SOTA general-purpose AIs, which are getting really good at things like answering graduate-level questions (o3 achieves 87.7% accuracy on GPQA) and coding, but are not really agentic enough yet to, for example, replace all remote workers.

On the other hand, an AI system that can perform all purely intellectual tasks that a human can do might be developed before the robotics criterion is fulfilled. Depending on what definition of AGI you prefer, AGI might arrive before the resolution of the question above. These considerations imply that the question only provides a rough proxy for estimating actual AGI arrival time.

Manifold is of course also attempting to forecast AGI. A user called RemNi helpfully posted a question for the probability that AGI arrives before each of the coming years. The resolution criteria are relatively subjective but seem good enough to me: “AGI can theoretically perform any intellectual task that a human being can. It involves the capability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly, and learn from experience.” Here are the links to the markets for the next 8 years, and their respective predictions ATTOW:

We can also follow how the probability has changed over time:

The difference between the Manifold and Metaculus predictions could be due to the different resolution criteria. Even if the Manifold version is a bit more subjective, to me it seems closer to an actual definition of AGI than the Metaculus question. I don’t think robotics capabilities are necessary for an AI system to be transformative, which an AGI would arguably be.

There are also markets for how probable it is that OpenAI announces AGI this year. Polymarket and Kalshi estimate 27% probability and 24% probability respectively for this ATTOW.

Of course, an announcement of AGI is not equivalent to AGI arrival. It seems that OpenAI and Microsoft have made a deal under which Microsoft will lose access to OpenAI technology once AGI is achieved, and they have agreed on a monetary definition of AGI as an “AI system that can generate at least $100 billion in profits”. AGI could plausibly arrive before then, since AGI could be really expensive to run. But the market criteria are “OpenAI or an official representative of the company announces that it has created an artificial general intelligence (AGI)”, and OpenAI could make such an announcement to the public before the monetary definition is met. It wouldn’t really surprise me if they announced AGI as a market strategy to hype their technology. So even if this market is correctly priced according to its criteria, I think it is likely to overestimate how likely OpenAI is to actually achieve AGI this year. And since OpenAI seems to be at the frontier, it probably also overestimates the probability of AGI this year in general.

Conclusion

To summarize some of the most important details:

Manifold estimates a 14% probability of AGI this year, a bit lower than Kalshi’s 24% probability and Polymarket’s 27% probability for OpenAI announcing AGI this year.

Metaculus expects AGI around the middle of 2030 but requires robotics capabilities that some would argue are not necessary for AGI.

Manifold estimates a 47% probability of AGI before 2028. It also estimates a 43% probability that AGI arrives after 2025 and before 2029:
57% (before 2029) - 14% (before 2026) = 43%.
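The same subtraction can be applied to every pair of adjacent markets. Below is a minimal Python sketch that uses only the four Manifold numbers quoted in this post (ATTOW). The function names are hypothetical, and it assumes that "before year Y" means before January 1 of Y and that the four markets can be read as points on one cumulative curve.

```python
# Cumulative Manifold probabilities quoted in this post (ATTOW):
# P(AGI arrives before January 1 of the given year).
cumulative = {2026: 0.14, 2027: 0.34, 2028: 0.47, 2029: 0.57}


def interval_probability(cdf, from_year, before_year):
    """P(AGI arrives on or after Jan 1 of from_year and before Jan 1 of before_year)."""
    return cdf[before_year] - cdf[from_year]


def per_year_probabilities(cdf):
    """Split the cumulative values into one probability per calendar year."""
    years = sorted(cdf)
    result = {}
    for start, end in zip(years, years[1:]):
        # Probability mass that falls inside calendar year `start`.
        result[start] = cdf[end] - cdf[start]
    return result


if __name__ == "__main__":
    # After 2025 and before 2029, as in the text above.
    print(round(interval_probability(cumulative, 2026, 2029), 2))
    for year, p in per_year_probabilities(cumulative).items():
        print(year, round(p, 2))
```

Read this way, the markets imply roughly 20% for arrival during 2026, 13% during 2027, and 10% during 2028, which together make up the 43% above.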

What should you make of this? First, although several sources predict that AGI is likely to arrive within the next 10 years, there is high uncertainty. The predictions differ between Metaculus and Manifold, as do their respective resolution criteria. Manifold estimates about an even chance of AGI arriving before 2028, while Metaculus estimates an even chance of AGI arriving before mid-2030.

What do I think myself?

My best guess, with above 50% probability, is that AGI arrives somewhere between mid-2026 and the end of 2027. This approximately matches Manifold’s predictions, but with a lower probability of AGI this year.

Also, the Manifold predictions can provide an estimate of the cumulative distribution function for AGI arrival. I think they get the shape of the function right, with the probability of AGI rising fast in the beginning and rising more slowly after 2029. If AGI takes that long, it might mean that it was harder than expected to achieve. Additionally, humanity will have had more time to coordinate around regulation and international treaties, perhaps even pulling off an international pause, which would lengthen AGI timelines.
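To make the cumulative-distribution reading concrete, here is a small Python sketch that treats the four Manifold numbers quoted in this post (ATTOW) as points on a CDF and interpolates where it crosses a target probability. Linear interpolation between the yearly markets is my own simplifying assumption, not something the markets specify, and the function name is hypothetical.

```python
# Manifold cumulative probabilities quoted in this post (ATTOW), keyed by
# year boundary: P(AGI arrives before January 1 of that year).
points = [(2026, 0.14), (2027, 0.34), (2028, 0.47), (2029, 0.57)]


def year_at_probability(points, target):
    """Linearly interpolate the time at which the CDF reaches `target`.

    Returns a fractional year (2028.5 means mid-2028), or None if `target`
    lies outside the range covered by the quoted points.
    """
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 <= target <= p1:
            if p1 == p0:
                # Flat segment: the target is already reached at x0.
                return float(x0)
            return x0 + (target - p0) / (p1 - p0) * (x1 - x0)
    return None


if __name__ == "__main__":
    # Approximate date at which the interpolated curve reaches 50%.
    print(year_at_probability(points, 0.5))
```

Under these assumptions the 50% point lands at about 2028.3, a few months into 2028, which is in line with reading the 47% market as "about an even chance" before 2028. Probabilities above 57% return None, because the later markets are not quoted in this text.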

I will write more about international coordination in a separate post; prediction markets have many interesting things to say about that as well.


