LessWrong, July 14, 2024
LLMs as a Planning Overhang

This post examines two kinds of AI under development today, goal-optimising AIs and planner-simulator AIs, and their potential implications for existential risk to humanity. It notes that although current AIs already surpass humans in many domains, they are mostly not goal-directed; instead they behave more like simulations of helpful human behaviour. The author argues that such non-goal-directed AIs may amplify the risks posed by goal-directed AIs, and suggests ways to respond.

🔍 Goal-optimising vs. planner-simulating AI: the post distinguishes goal-optimising AIs, which have explicit objectives they try to maximise, from planner-simulator AIs, which simulate human behaviour.

💡 Risks from non-goal-optimising AI: although non-goal-directed AIs currently look harmless, they may act as an accelerant for goal-optimising AIs and thereby worsen existential risk.

🚀 The case for accelerating goal-directed AI research: to get a handle on goal-directed AI before hardware and planning/forecasting capabilities grow too far, it may be necessary to accelerate that research.

🛡️ Re-evaluating safety and alignment work: current safety/alignment work may need to be re-assessed to ensure it applies directly to the goal-maximising AIs of the future.

🤖 The unpredictability of AI development: even though AI progress has not followed the anticipated path, the eventual arrival of goal-directed AI could still pose major risks.

Published on July 14, 2024 2:54 AM GMT

It's quite possible someone has already argued this, but I thought I should share just in case not.

Goal-Optimisers and Planner-Simulators

When people in the past discussed worries about AI development, this was often about AI agents - AIs that had goals they were attempting to achieve, objective functions they were trying to maximise. At the beginning we would make fairly low-intelligence agents, which were not very good at achieving things, and then over time we would make them more and more intelligent. At some point around human level they would start to take off, because humans are approximately intelligent enough to self-improve, and self-improvement would be much easier in silicon.

This does not seem to be exactly how things have turned out. We have AIs that are much better than humans at many things, such that if a human had these skills we would think they were extremely capable. And in particular, LLMs are getting better at planning and forecasting, now beating many but not all people. But they remain worse than humans at other things, and, most importantly, the leading AIs do not seem to be particularly agentic - they do not have goals they are attempting to maximise; rather, they are just trying to simulate what a helpful redditor would say.

What is the significance for existential risk?

Some people seem to think this contradicts AI risk worries. After all, ignoring anthropics, shouldn’t the existence of human-competitive AIs that have so far caused no such problems be evidence against the risk of human-competitive AI?

I think this is not really the case, because you can take a lot of the traditional arguments and just substitute ‘agentic goal-maximising AIs, not just simulator-agents’ wherever people said ‘AI’, and the argument still works. It seems like eventually people are going to make competent goal-directed agents, and at that point we will indeed have the problems of their exerting more optimisation power than humanity.

In fact it seems like these non-agentic AIs might make things worse, because the goal-maximising agents will be able to use the non-agentic AIs.

Previously we might have hoped to have a period where we had goal-seeking agents that exerted influence on the world similar to a not-very-influential person, and that were not very good at planning or understanding the world. But if such an agent can query the forecasting LLMs and planning LLMs, then as soon as it ‘wants’ something in the real world it seems like it will be much more able to get it.
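As a purely illustrative sketch of this dynamic: a thin goal-directed wrapper gets a large capability boost the moment it can delegate its planning to a non-agentic planner LLM, without the wrapper itself getting any smarter. The functions `weak_internal_planner` and `query_planner_llm` below are hypothetical stubs, not real model APIs, so the example runs on its own.

```python
from typing import Callable, List


def weak_internal_planner(goal: str) -> List[str]:
    # The agent's own planning ability: shallow and generic.
    return [f"try something vaguely related to: {goal}"]


def query_planner_llm(goal: str) -> List[str]:
    # Hypothetical stand-in for a non-agentic planning/forecasting LLM.
    # In practice this would be an API call; here it just returns a
    # canned, higher-quality plan for illustration.
    return [
        f"decompose '{goal}' into concrete subgoals",
        "forecast likely obstacles and pick the most promising subgoal",
        "draft a detailed step-by-step plan for that subgoal",
        "execute, observe the results, and re-plan",
    ]


def goal_directed_agent(goal: str, planner: Callable[[str], List[str]]) -> List[str]:
    # The agentic part is tiny: it holds a goal and follows whatever plan
    # the planner returns, so its effective capability is mostly the
    # planner's capability.
    return planner(goal)


if __name__ == "__main__":
    goal = "acquire more compute"
    print("own planning:  ", goal_directed_agent(goal, weak_internal_planner))
    print("tool-LLM plans:", goal_directed_agent(goal, query_planner_llm))
```

The point of the sketch is only that the planner is interchangeable: nothing about the thin agentic wrapper has to improve for its effective competence to jump, which is the sense in which pre-existing planning and forecasting tools act as an overhang.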

So it seems like these planning/forecasting non-agentic AIs might represent a sort of planning overhang, analogous to a hardware overhang. They don’t directly give us existentially-threatening AIs, but they provide an accelerant for when agentic AIs do arrive.

How could we react to this?

One response would be to say that since agents are the dangerous thing, we should regulate/restrict/ban agentic AI development. In contrast, tool LLMs seem very useful and largely harmless, so we should promote them a lot and get a lot of value from them.

Unfortunately it seems like people are going to make AI agents anyway, because ML researchers love making things. So an alternative possible conclusion would be that we should actually try to accelerate agentic AI research as much as possible, because eventually we are going to have influential AI maximisers, and we want them to occur before the forecasting/planning overhang (and the hardware overhang) get too large.

I think this also makes some contemporary safety/alignment work look less useful. If you are making our tools work better, perhaps by understanding their internal workings better, you are also making them work better for the future AI maximisers who will be using them. Only if the safety/alignment work applies directly to the future maximiser AIs (for example, by allowing us to understand them) does it seem very advantageous to me.



