Here, I track my evolving thoughts on what remains on the path to building generally-intelligent agents. Why does this matter? Three compelling reasons:
1. Top-down view: AI research papers (and product releases) move bottom-up, starting from what we have right now and improving incrementally, in the hope that we eventually converge on the end-goal. This is good; that's how concrete progress happens. At the same time, to direct our efforts, it is important to have a top-down view of what we have achieved and what the remaining bottlenecks towards the end-goal are. Besides, known unknowns are better than unknown unknowns.
2. Research prioritisation: I want this post to serve as a personal compass, reminding me which capabilities I believe are most critical for achieving generally-intelligent agents: capabilities we haven't yet figured out. I suspect companies have internal roadmaps for this, but it's good to also discuss this in the open.
3. Forecasting AI progress: Recently, there has been much debate about the pace of AI advancement, and for good reason: this question deserves deep consideration. Generally-intelligent agents will be transformative, and both policymakers and society will need to prepare accordingly. Unfortunately, I think AI progress is NOT a smooth exponential that we can extrapolate to make predictions. Instead, the field moves by shattering one (or more) wall(s) every time a new capability gets unlocked. These breakthroughs present themselves as large increases in benchmark performance over a short period of time, but the absolute performance jump on a benchmark provides little information about when the next breakthrough will occur, because, for any given capability, it is hard to predict when we will know how to make a model learn it (the toy simulation after this list makes the point concrete). Still, it is useful to know which capabilities are important and what kinds of breakthroughs are needed to achieve them, so we can form our own views about when to expect each one. This is why this post is structured as a countdown of capabilities which, as we build them out, will get us to “AGI” as I think about it.
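To make the wall-shattering picture concrete, here is a minimal toy simulation. It is purely illustrative: the breakthrough rate and jump sizes are numbers I made up, and nothing is fit to real benchmark data. Breakthroughs arrive at unpredictable times and each produces a sudden jump, so an exponential fit to early history carries almost no information about where the curve goes next:

```python
# Toy model (purely illustrative; all rates and jump sizes are invented):
# benchmark progress driven by rare breakthroughs vs. a smooth exponential
# extrapolated from early history.
import numpy as np

rng = np.random.default_rng(0)

years = 20
# Breakthroughs arrive as a Poisson process, one every ~3 years on average.
gaps = rng.exponential(scale=3.0, size=12)
breakthroughs = np.cumsum(gaps)
breakthroughs = breakthroughs[breakthroughs < years]

# Each breakthrough unlocks a sudden 15-30 point jump in benchmark score.
t = np.linspace(0.0, years, 400)
score = np.zeros_like(t)
for bt in breakthroughs:
    score += np.where(t >= bt, rng.uniform(15.0, 30.0), 0.0)

# A forecaster at year 5 fits an exponential to the history so far and
# extrapolates. The fit can match the past well and still say nothing
# about when the next wall falls.
hist = t <= 5.0
slope, intercept = np.polyfit(t[hist], np.log1p(score[hist]), deg=1)
extrapolated = np.expm1(slope * t + intercept)

i = t.searchsorted(10.0)
print("breakthrough times (years):", np.round(breakthroughs, 1))
print("actual score at year 10:", round(float(score[i]), 1))
print("extrapolated score at year 10:", round(float(extrapolated[i]), 1))
```

Rerunning with different seeds moves the jumps around but not the qualitative picture: the extrapolation and the step function diverge as soon as the next unpredicted breakthrough (or drought) arrives.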
Framework
To be able to work backwards from the end-goal, I think it's important to use accurate nomenclature that defines it intuitively. This is why I'm using the term generally-intelligent agents: I think it encapsulates the three qualities we want from “AGI”:
Generality: being useful for as many tasks and fields as possible.
Intelligence: learning new skills from as few experiences as possible.
Agency: planning and performing long chains of actions.
This post comes in two parts. In this first part, I discuss the frontier: capabilities needed to achieve general agents, towards which we are already seeing progress. In the follow-up, to be released later, I will cover the future: the remaining capabilities needed to add intelligence, which might take longer. I will skip discussion of further modalities (vision, audio, etc.) and of safety, which I think are extremely important but beyond the scope of this post.
I used the more popular term “AGI” in the title as it's a handy, recognisable shorthand for these ideas. But it's also overloaded: some definitions of it might already be achieved, and others are not concrete enough to work backwards from. So I will avoid it for the rest of the post.

I also dislike the term “ASI” (Artificial Superintelligence). It leaves me wondering: super in what way, and compared to what? Often people mean better than humans. But why should that be the end-goal? First, it is ill-defined, since different humans vary widely in their capabilities. Second, computers are already superhuman in so many ways. They already store more knowledge than any single human, with modern LLMs offering superhuman knowledge retrieval for any natural-language query. Computational search is also better at optimising any programmatically specifiable task, such as fitting a curve (see the sketch below). I think we can achieve superhuman performance on any capability. There is no reason to believe humans are optimal; we are just one instance of generally-intelligent agents, and there is no reason why we cannot create better ones. Besides, what's easy for humans might not be for AI (physical motor control), and vice versa (breadth of knowledge). This is another reason to think about AI progress as a basket of capabilities, and to measure performance on each of them.
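To ground the curve-fitting example, here is a minimal sketch with invented data: noisy samples from a cubic, where a least-squares solve recovers the underlying coefficients in milliseconds, far more precisely than a human eyeballing the scatter could. The specific coefficients and noise level are arbitrary choices for illustration:

```python
# Minimal sketch of a "programmatically specifiable task": the objective
# (squared error) is machine-checkable, so computational search beats
# human guessing. The true coefficients and noise level are invented.
import numpy as np

rng = np.random.default_rng(1)

# Noisy samples from an unknown cubic: y = 0.5x^3 - 2x + 1 + noise.
x = np.linspace(-3.0, 3.0, 200)
y = 0.5 * x**3 - 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

# Least-squares fit recovers the coefficients from the noisy data.
coeffs = np.polyfit(x, y, deg=3)
rms = np.sqrt(np.mean((np.polyval(coeffs, x) - y) ** 2))

print("recovered coefficients:", np.round(coeffs, 2))  # close to [0.5, 0, -2, 1]
print("RMS residual:", round(float(rms), 3))            # close to the noise scale
```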
- AI 2024 - Generality of Knowledge
- Part I on The Frontier: General Agents
  - Reasoning: Algorithmic vs Bayesian
  - Information Seeking
  - Tool-use
  - Towards year-long action horizons
    - Long-horizon Input: The Need for Memory
    - Long-horizon Output
  - Multi-agent systems
- Part II on The Future: Generally-Intelligent Agents [TBA]