For the purposes of this post, I'm defining FOOM as a situation in which, once an AI is capable enough to automate all AI R&D, progress explodes hyper-exponentially for a period. This happens because the returns to better software are greater than 1, meaning AI labor quality improves faster than the problem of finding new algorithms gets harder, combined with the potentially high limits on how efficient software can get. The upshot is that the AI gets OOMs smarter on a fixed compute budget within months or weeks.
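To make the "returns greater than 1" condition concrete, here's a toy simulation of the feedback loop. This is my own illustrative sketch, not a model from the posts linked below; the returns parameter r and the 8 OOM efficiency ceiling are assumptions chosen to match the numbers discussed later.

```python
import math

def time_to_ceiling(r: float, dt: float = 0.001, t_max: float = 50.0,
                    ceiling_ooms: float = 8.0) -> float | None:
    """Return the time at which software efficiency hits the ceiling, or None.

    R is cumulative research effort; software efficiency is S = R**r, i.e.
    each doubling of research buys r doublings of efficiency ("returns to
    software"). The research *rate* is proportional to S, because AIs are
    doing the R&D.
    """
    R, t = 1.0, 0.0
    while t < t_max:
        S = R ** r                        # current software efficiency
        if math.log10(S) >= ceiling_ooms:
            return t                      # hit the assumed efficiency ceiling
        R += S * dt                       # AI research labor scales with S
        t += dt
    return None

for r in (0.7, 1.0, 1.5):
    t_hit = time_to_ceiling(r)
    if t_hit is None:
        print(f"r={r}: ceiling not reached by t_max (sub-exponential growth)")
    else:
        print(f"r={r}: 8 OOMs of efficiency reached at t={t_hit:.2f}")
```

With r > 1 the loop hits the ceiling in finite time regardless of how you rescale the time units (a finite-time blow-up), which is the hyper-exponential behavior I mean by FOOM; with r = 1 growth is merely exponential, and with r < 1 it is polynomial.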
These articles can help explain what I mean better:
https://www.forethought.org/research/will-ai-r-and-d-automation-cause-a-software-intelligence-explosion
https://www.forethought.org/research/will-the-need-to-retrain-ai-models (an auxiliary post)
https://www.forethought.org/research/how-far-can-ai-progress-before-hitting-effective-physical-limits (where they estimate that about 5 OOMs of progress could be gotten "for free": the human brain's equivalent of pretraining compute is about 10^24 FLOP, whereas the AI pretraining compute needed to automate AI R&D is about 10^29 FLOP. Their median estimate is that 8 OOMs of software efficiency gains are possible in total, though with very large uncertainty: the error bars run from 4 to 12 OOMs.)
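To spell out the arithmetic behind those figures (the estimates come from the linked post; the snippet just checks the OOM calculation):

```python
import math

human_pretraining_flop = 1e24  # Forethought's estimate of the brain's "pretraining" compute
ai_pretraining_flop = 1e29     # rough training compute for an AI that automates AI R&D

# OOMs of efficiency available "for free" just by matching human learning efficiency
free_ooms = math.log10(ai_pretraining_flop / human_pretraining_flop)
print(free_ooms)  # 5.0, i.e. the "about 5 OOMs for free" figure above
```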
Also note that we are only talking about training compute efficiency, not runtime/inference efficiency, so for the purposes of this discussion, only improvements in training efficiency count as part of a software intelligence explosion.
Now, I don't want to debate whether this scenario is true (though for those who want it, my current probability of something like a FOOM/software intelligence explosion, conditional on AI R&D being automated, is in the 40-50% range). Rather, the question is: given that a software explosion is possible, could we figure out a way to adapt AI control to that case, or is AI basically uncontrollable if a software intelligence explosion does happen and there are a lot of OOMs between current software and the physical limits of intelligence?
I'd be especially interested in responses from @Buck or @ryan_greenblatt on this question, but anyone can answer if they have a relevant insight to share.