LessWrong, December 22, 2024
Orienting to 3 year AGI timelines

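This post predicts that artificial general intelligence (AGI) will be achieved within three years and explores what that timeline implies for action. It analyzes the important variables and players from 2025 to 2027, arguing that before AGI arrives, humanity's main tasks are to find safe ways to delegate research to AI and to make sure safety interventions are adequate. It also stresses that once AGI arrives, the allocation of AI agents and their priorities become crucial. In addition, it lists prerequisites for humanity's survival, including a sensible takeoff plan, strong cybersecurity, and avoiding global tensions. Finally, it offers recommendations for robustly good actions under short AGI timelines.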

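🗓️ By the end of 2025, AI assistants can competently handle most 2-hour software engineering tasks, and company employees begin relying on them for day-to-day work. This marks significant progress for AI in practical use.

🤖 By the end of 2026, AI can complete multi-day coding tasks. Employees at AGI companies realize that AI may surpass humans within two years, and the government steps in and tightens oversight of AGI companies, reflecting heightened alarm at the pace of AI progress.

🔒 In 2027, AGI is created. Most of the company's work is done by AI agents, and human employees mainly help with problems the agents get stuck on. AI has made a technical breakthrough and has begun to dominate the workflow.

⚠️ Prerequisites for humanity's survival include a sensible takeoff plan, effective cybersecurity, and avoiding the global tensions that AGI development could trigger. These are key to staying stable and safe in the AGI era.

🤝 Under short AGI timelines, people should actively join alignment research, make a concrete plan, and start acting as early as possible to meet the coming challenges. This underscores how urgent and important action is.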

Published on December 22, 2024 1:15 AM GMT

My median expectation is that AGI[1] will be created 3 years from now. This has implications for how to behave, and I will share some useful thoughts I and others have had on how to orient to short timelines.

I’ve led multiple small workshops on orienting to short AGI timelines and compiled the wisdom of around 50 participants (but mostly my thoughts) here. I’ve also participated in multiple short-timelines AGI wargames and co-led one wargame.

This post will assume median AGI timelines of 2027 and will not spend time arguing for this point. Instead, I focus on what the implications of 3 year timelines would be. 

I didn’t update much on o3 (as my timelines were already short) but I imagine some readers did and might feel disoriented now. I hope this post can help those people and others in thinking about how to plan for 3 year AGI timelines.

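The outline of this post is:

    A story for a 3 year AGI timeline
    Important variables based on the year
    Important players
    Prerequisites for humanity’s survival which are currently unmet
    Robustly good actions
    Final thoughts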

A story for a 3 year AGI timeline

By the end of June 2025, SWE-bench is around 85%, RE-bench at human budget is around 1.1, beating the 70th percentile 8-hour human score. By the end of 2025, AI assistants can competently do most 2-hour real-world software engineering tasks. Whenever employees at AGI companies want to make a small PR or write up a small data analysis pipeline, they ask their AI assistant first. The assistant writes or modifies multiple interacting files with no errors most of the time. 

Benchmark predictions under 3 year timelines. A large part of the reason the OSWorld and CyBench predictions aren’t higher is that I’m not sure whether people will report results on those benchmarks. I don’t think things actually turning out this way would be strong evidence for 3 year timelines, given the large disconnect between benchmark results and real-world effects.

By the end of 2026, AI agents are competently doing multi-day coding tasks. The employees at AGI companies are thoroughly freaked out and expect that AI which can beat humans at 95% of virtual jobs is probably going to be created within 2 years. They also expect that superintelligence will follow soon after. The government realizes that AI will be decisive for national power and locks down the AGI companies in late 2026. This takes the form of extreme government oversight bordering on nationalization. Progress stays at a similar pace because of the race with other nuclear weapons states. 

Starting in 2027, most of the company’s quality-weighted workforce is made up of AI agents. The main decisions made by leadership are about allocating their workforce of millions of agents to various research areas, including AI R&D, safety, commercial applications, military applications, cyber, operations, communications, policy work, and most other types of work done on computers at the company. The human employees don’t matter much at this point, except to help answer questions for groups of AI agents that get stuck and want a second opinion on their work.

AGI is created by the end of 2027. History probably doesn’t end here, but for brevity I will not describe the post-AGI world in this post.

Important variables based on the year

Note that there’s a significant shift in dynamics in the middle of the story, which also implies significant shifts in the strategic landscape.

The pre-automation era (2025-2026). 

In 2025 and 2026, humans are still doing most of the work. The most important questions center on the allocation of humans and on the commercial and regulatory pressures placed on AI labs and the rest of the supply chain.

In the pre-automation era, humanity’s main priority should be very quickly finding safe ways to delegate research to AI agents. The main reason to do any type of safety-oriented research is to control these precursor agents who will later continue the research. 

Another priority of the pre-automation era is finding ways to tell whether our current safety interventions will be adequate to prevent large numbers of AI agents from scheming or doing other undesirable things. Part of this is also setting up systems to pause and convince others to pause in case an adequate safety case can’t be made. This will get harder as the race heats up.

The post-automation era (2027 onward). 

After 2026, AIs are doing most of the work. At this point, the research is mostly out of human hands, but the human employees are still involved in high-level decisions and interfacing with humans outside the AGI company. By the end of 2028, humans can no longer contribute to technical aspects of the research.

The main questions center around the allocation of AI agents, and their mandated priorities. Some important questions about this period are:

    How good is the broad research plan that the AI agents are pursuing?
      For example, if the human in charge of initially scoping out the research direction is someone who is fundamentally confused about AI safety, the hopes of aligning models might be doomed despite an initially well-meaning population of AI agents.
    How much of the company’s resources are invested in safety-oriented research?
      Allocating 0.1% of compute vs 25% of compute to safety might make a large difference in the success of the safety work that is done.

Important players

Note that "The AI Safety Community" is not part of this list. I think external people without much capital just won't have that much leverage over what happens.

Prerequisites for humanity’s survival which are currently unmet

This is not meant to be an exhaustive list.

    A sensible takeoff plan. Currently, AGI companies lack a vision for how to navigate safely handing off research to AI agents.
      The alignment approach - companies don’t have (public) default plans for which research areas to assign to their population of AI agents.
      Compute commitments - even with a sensible alignment approach, a lack of commitments might lead to an inadequate proportion of AI agents and compute being allocated to it.
      Frontier safety frameworks - the requirements and commitments around SL-4 and SL-5 are currently very unclear, allowing a lot of wiggle room to cut corners during takeoff.
      Control - the science of safely handing off work to AI agents (or being able to tell that it’s not safe to do so) is very underdeveloped.
    State-proof cybersecurity. If bad actors can steal the weights of very advanced AI systems, things become extremely unpredictable due to misuse and enabling less careful entities to create advanced AI.
    A way to survive global tensions. The creation of AGI will disrupt the balance of military power between countries, possibly giving one entity a decisive strategic advantage. I think the probability of nuclear war in the next 10 years is around 15%. This is mostly due to the extreme tensions that will occur during takeoff by default. Finding ways to avoid nuclear war is important.
      During the Cold War, there were multiple nuclear close calls that brought us close to annihilation. Some of these were a consequence of shifts in the strategic balance (e.g. the Cuban Missile Crisis). The US threatened the USSR with nuclear war over the Berlin Blockade. The creation of superintelligence will make these events seem trivial in comparison, and the question is just whether officials will realize this.
    Doing nationalization right.
      Getting the timing right. If nationalization happens too late (e.g. after AGI), the ensuing confusion and rapid change within the project might lead to bad decision making.
      Creating default plans. There will likely be a time in 2025 or 2026 that will be marked by significant political will to lock down the labs. If there don’t already exist any good default plans or roadmaps on how to do this, the plan will likely be suboptimal in many ways and written by people without the relevant expertise.
      Building political capital. Unless people with relevant expertise are well-known to important actors, the people appointed to lead the project will likely lack the relevant expertise.
      Keeping safety expertise through nationalization. A nationalization push which ousts all AI safety experts from the project will likely end up with the project lacking the technical expertise to make their models sufficiently safe. Decisions about which personnel the nationalized project will inherit will likely largely depend on how safety-sympathetic the leadership and the capabilities-focused staff are, which largely depends on building common knowledge about safety concerns.

Robustly good actions

Final thoughts

I know it can be stressful to think about short AGI timelines, but this should obviously not be taken as evidence that timelines are long. If you made your current plans under 10 or 20 year timelines, they should probably be changed or accelerated in many ways at this point.

One upside of planning under short timelines is that the pieces are mostly all in place right now, and thus it’s much easier to plan than e.g. 10 years ahead. We have a somewhat good sense of what needs to be done to make AGI go well. Let's make it happen.

Good luck navigating the singularity.

  1. ^

    I define AGI here as an AI system which is able to perform 95% of the remote labor that existed in 2022. I don’t think definitions matter that much anyways because once we reach AI R&D automation, basically every definition of AGI will be hit soon after (barring coordinated slowdowns or catastrophes).

  2. ^

    While they’re still using Slack, that is. After strong government oversight, it’s unlikely that external human researchers will have any nontrivial sway over what happens on the inside.


