Note: This post is the second in a broader series of posts about the difficult tradeoffs inherent in public access to powerful open source models (the first post is here). While this post highlights some dangers of open models and discusses the possibility of global regulation, I am not, in general, against open source AI, nor am I supportive of regulating open source AI today. On the contrary, I believe open source software is one of humanity’s most important and valuable public goods. My goal in this series of posts is to call attention to the risks and challenges around open models now, so that we can use the time we still have, before risks become extreme, to collectively explore viable alternatives to regulation, if indeed such alternatives exist.
In the past year there have been a number of impressive efforts by AI safety researchers to create detailed “scenario models” that represent plausible stories for how the next several years might play out. The basic idea is that while these models won’t get everything right, the scenarios should at least be plausible in the sense that they are internally consistent and mostly believable in how the events unfold. The best-known and most fully developed examples of such scenario models that I’m aware of are Leopold Aschenbrenner’s Situational Awareness, @joshc’s How AI Might Take Over in 2 Years, and, most recently, AI 2027 by @Daniel Kokotajlo et al.
I have personally been very much inspired by all three of these efforts and believe they’ve had some of the greatest impact of any AI safety effort so far, especially in terms of raising public awareness around AI risks. However, there is an aspect of these scenario models that I believe is – for lack of a better word – untenable: so far they have either entirely ignored or vastly downplayed the role open models are likely to play as the near future of AI unfolds.
Broadly speaking, the three scenarios above focus almost entirely on the arms race-style dynamics between leading nation-states (particularly the US and China), on the efforts of one or more leading labs to internally align their most powerful models, or on both. As a result, the scenarios tend to portray essentially all key high-level global impacts from AI over the next several years as coming entirely from closed models controlled by just one or two leading labs and governments, and the stories tend to unfold entirely as power struggles between those players, or internal to them.
The problem with this kind of laser focus on a few key players is that it simply does not match the current reality of very powerful and widely distributed open models like DeepSeek V3 and Llama 4, or the fact that organizations like Meta are deeply ideologically committed to continuing to release their models in the open, even as capabilities and risks escalate. Or the fact that OpenAI currently appears to be moving back towards releasing more open models and is set to release its first open model since GPT-2 this summer.
It would be one thing if the scenarios gave some high-level account of why they expect open models to go away, or of how they expect global regulation and enforcement against open models to be enacted on relatively short timescales – given that all available evidence suggests this will not be quick or easy at all. But instead the scenarios seem to simply ignore the fact that open models exist and appear to expect that they will have little to no meaningful role in the future. To quantify this a bit: AI 2027 mentions the term “open source” exactly once, as a very brief aside late in the text and only as a topic that some “activists” are talking about. Open models are not mentioned in the scenario at all. Ditto for How AI Might Take Over in 2 Years, which does not mention open source or open models a single time.
To his credit, Aschenbrenner does mention open source AI briefly in a number of sections of Situational Awareness, but it is still very much treated as a side note and as something he argues will quickly be rendered irrelevant:
Thus, open source models today are pretty good, and a bunch of companies have pretty good models (mostly depending on how much $$$ they raised and how big their clusters are). But this will likely change fairly dramatically in the next couple years. Basically all of frontier algorithmic progress happens at labs these days (academia is surprisingly irrelevant), and the leading labs have stopped publishing their advances. We should expect far more divergence ahead: between labs, between countries, and between the proprietary frontier and open source models.
This prediction is not holding up well so far; if anything, the gap appears to be closing, with Epoch AI estimating that the best open models lag the best closed models by only about one year.
To be more specific, the most critical problems I see with leaving open source AI and open models out of scenario modeling efforts are the following:
1. Even if open models lag behind, the risks from open models are generally much harder to mitigate.
I explore this issue in much greater detail with respect to loss-of-control risks in my post We Have No Plan for Preventing Loss of Control in Open Models, although many of the concerns I highlight in that post are general. For example, the concern that risks from open models are hard to mitigate is perhaps even more urgently true of AI-assisted CBRN risks, given that expert-level virology capabilities are already appearing in frontier models today. Scenarios like AI 2027 and Situational Awareness portray superintelligent AI as just three years away, but if that’s true, how do we avoid dying from a supervirus created with the help of open-model-based virology tools well before that point?
2. A much wider variety of actors will have access to open models, including many truly bad actors.
AI 2027 and How AI Might Take Over in 2 Years primarily focus on AI operators who, while morally compromised in many ways, still generally have a baseline interest in maintaining the current international order, avoiding violence where possible, promoting human flourishing and so on. Meanwhile, in the real world today, open models are already being deployed at scale in combat drones in Ukraine and in conflicts in the Middle East. And over the next few years we are almost certain to see more ChaosGPT-style experiments in which operators recklessly and intentionally relinquish control of their AI agents, or simply run them without safeguards in risky environments like capital markets, or use them for cyberattacks, in attempts to accumulate money or power.
While a country like North Korea or Iran or Russia may never be capable of training a model that is truly at the frontier, we would still be wise to worry very much about what they might do with all of their immense military capabilities, hacking expertise, compute resources and so on, given access to open models that are just one year behind the frontier, especially if they fear a permanent shift in the equilibrium of global power.
3. Open models may be much more risky in “slowdown” scenarios.
In the “slowdown” branch of AI 2027, “superintelligent AI” becomes available to the general public in mid-2028 (presumably with powerful guardrails in place), as miracle disease cures are arriving and robot factories are proliferating – and yet there is still no mention of what is happening with the open models of that day. How do we get through 2025, 2026 and 2027 with no superviruses? Or with no high-profile drone assassinations of political leaders?
More broadly, while it seems quite possible that a “slowdown” strategy could decrease loss-of-control risk originating inside frontier labs (as suggested by AI 2027), it seems equally likely that catastrophic risks with lower capability requirements – such as AI-assisted CBRN and conventional AI combat drones – would actually increase dramatically in slowdown scenarios, as we wait longer for the superintelligent AI policeman to arrive.
=-=-=-=-=
To be clear, my argument isn’t that open source AI must be the main storyline in any scenario modeling effort. Rather, it’s that it is totally unrealistic and untenable to simply ignore open source AI, or to treat open models as a trivial detail, when telling a story of the next 2–3 years – when models like DeepSeek V3 and Llama 4 exist today, and when leading labs, along with many industry leaders, elected officials and policymakers, are ideologically committed to continuing to release open models as close to the frontier as possible for the foreseeable future.