Published on June 20, 2025 11:42 AM GMT
Last month, think tank RAND published a report titled On the Extinction Risk from Artificial Intelligence and an accompanying blog post asking the question: “Could AI Really Kill Off Humans?” At the Existential Risk Observatory, this is precisely our expertise, so of course we were intrigued.
Author Michael Vermeer writes in the blog post: "Pandemics and nuclear war are real, tangible concerns, more so than AI doom, at least to me, a scientist at RAND." He goes on to say: "We swallowed any of our AI skepticism. (...) We were trying to take the risk of extinction seriously." It doesn't sound like this was a particularly easy job for them.
Indeed, their results end up being sceptical about the chance of human extinction caused by AI, despite many top AI researchers warning about exactly this. Their recommendations discourage pausing AI development for precautionary reasons. The only measure RAND deems acceptable is some AI safety research, and mostly only if it would be worthwhile for other reasons anyway. So are the authors right, and can we rest assured? We don't think so.
Their research analyzed how AI might exploit three major ways to kill all humans: nuclear war, biological pathogens, and climate change. However, they give no reason why a future, advanced AI would limit its actions to these three avenues. What the authors basically did was try to look inside the mind of a superintelligence, something vastly smarter than they are, and predict what it will do. Not exactly a confidence-inspiring methodology.
The three scenarios the authors analyzed were judged by crunching the numbers: how easy is it to kill all human beings with only this intervention? For each of nuclear war, biological pathogens, and climate change, they conclude that this seems highly unlikely, since it is difficult to kill literally all humans. But we think they make a critical mistake here. Those who seize power, as an AGI might do, seldom do so by killing all the subjects in their territory of interest. After the bombing of Rotterdam in 1940, which claimed around 1,150 lives, the Netherlands and its 9 million inhabitants surrendered to the Nazis, as further resistance was considered meaningless. That amounts to a ratio of only 1,150/9,000,000 ≈ 0.013% of the population killed to bring about loss of control, the most important existential AI threat model. One could argue the Dutch did not fight particularly bravely, but other historical examples abound.
In general, the way to take over control by force is not to kill everyone, but to show sufficient strength and incapacitate enough of your opponent's ability to resist. One would guess that RAND, with its history of war-and-peace think-tankery, would not overlook this obvious fact. Once loss of control has occurred and an AI, or an AI-led team, is in control, complete human extinction, if desired, can be brought about by any means and over any timespan. The bar for achieving loss of control, and thereby risking eventual human extinction, is a lot lower than the bar for killing all humans directly.
In addition, this hard power approach is likely to be assisted by soft power - by narratives. Historian Yuval Harari wrote in a great piece for The Economist: "People may wage entire wars, killing others and willing to be killed themselves, because of their belief in this or that illusion. The AI revolution is bringing us face to face with Descartes’ demon, with Plato’s cave, with the Maya. If we are not careful, we might be trapped behind a curtain of illusions, which we could not tear away—or even realise is there." In an actual loss of control scenario, AI might well be able to hack our social and traditional media, and manipulate our beliefs in such a way that we might not resist at all. Also, AI might hack banks or cryptocurrency exchanges and simply hire people to do any necessary tasks.
Of course this is all speculative, and number crunching would do little to either prove or disprove such a scenario. This is an inherently frustrating property of AI existential risk: it hinges on whether one thinks a particular scenario is plausible or not. The RAND report, however, omits such a large part of why leading thinkers, including the world's most-cited AI scientists, believe that AI could cause human extinction, that its main conclusion, "It turns out it is very hard for AI to kill us all", is vastly overconfident and adds little of meaning to the existing debate.
Does that mean there is nothing of interest in this report? Not quite. As is often the case, the full report sends a considerably more nuanced message than the short commentary. In this case, although the latter is sceptical of AI's existential risks, there is plenty of reason for concern in the report itself. For example, out of the three ways discussed in which AI could use nuclear weapons, two ("deception and disinformation used to influence key individuals with authority to use nuclear weapons" and "gain unauthorized control over nuclear systems") are deemed serious risks. The authors think these are not existential only because "we have concluded that a single nuclear winter is unlikely to be an extinction risk". Hardly comforting and, as argued above, existentially unconvincing. Whether existential or not, it seems clear that these AI risks urgently need to be managed, and that a "wait and see" approach, as the authors propose, would be quite irresponsible.
On the biological pathogens scenario, the authors state: "it is plausible that a variety of human actors could perform [the steps necessary] to successfully carry out an attack using a novel pathogen in the future." The authors do think that "prevalence" of "generally capable robots" would be required (which may not actually be that far off), as opposed to the AI simply hiring or otherwise convincing humans to do the dirty work. All in all, this scenario, too, does seem to bring huge risks that may be inherent in the development of future, more capable AI.
Apart from these insights, we also like that the report correctly focuses on AI's capabilities, not on alignment. We agree that this is the more meaningful debate: once AI's capabilities are sufficient for loss of control, the existential risk will be manifestly present, whether because of misalignment or because of bad actors.
All in all, although there are useful insights in On the Extinction Risk from Artificial Intelligence, the report and especially the blog post unfortunately reveal critical mistakes in reasoning that invalidate their conclusion. RAND should redo its analysis on the topic, update its conclusions, and communicate those updates to the public. Because, as actual existential risk researchers know: yes, AI may well cause human extinction.