By Chris Leong
This was a prize-winning entry into the Essay Competition on the Automation of Wisdom and Philosophy.
- Leading AI labs are aiming to trigger an intelligence explosion, but perhaps this is a grave mistake? Maybe they should be aiming to trigger a “wisdom explosion” instead:
- Defining this as “pretty much the same thing as an intelligence explosion, but with wisdom instead” is rather vague[1], but I honestly think it is good enough for now. I think it’s fine for early-stage exploratory work to focus on opening up a new part of conversational space rather than trying to perfectly pin everything down[2].
- Regarding my definition of wisdom, I’ll be exploring this in more detail in part six (“What kinds of wisdom are valuable?”) of my upcoming Less Wrong sequence, but for now, I’ll just say that I take an expansive definition of what wisdom is and that achieving a “wisdom explosion” would likely require us to train a system that is fairly strong on a number of different subtypes. As an example, though: if a coalition of groups focused on AI safety were able to wisely strategize, wisely co-ordinate and wisely pursue methods of non-manipulative persuasion, I’d feel significantly better about humanity’s chances of surviving.
- In any case, I don’t want to center my own understanding of wisdom too much. Instead, I’d encourage you to consider the types of wisdom that you think might be most valuable for achieving a positive future for humanity, and whether the arguments below follow given how you conceive of wisdom, rather than how I conceive of it[3].
- In an intelligence explosion, the recursive self-improvement occurs within a single AI system. However, in defining a wisdom explosion, I want to take a more expansive view. In particular, instead of requiring that it occur within a single AI, I want to allow the possibility that it may occur within a cybernetic system consisting of both humans and AIs, either within a single organisation or within a cluster of collaborating organisations. In fact, I think this is the best route for pursuing a wisdom explosion.
- I find the version involving a cluster of collaborating organisations particularly compelling, both because it would enable the pooling of resources[4] for developing wisdom tech and because it would enable pursuing a pivotal process rather than a pivotal action.
- Capabilities inevitably proliferate: key factors include a strong open-source community, large career incentives for researchers to publish, and the challenges of preventing espionage.
- The attack-defense balance strongly favors the attacker: attackers only need to get lucky once, while defenders need to get lucky every time.
- The proliferation of capabilities most likely leads to an AI arms race: the diffusion of capabilities levels the playing field, which forces actors to race to maintain their lead.
- Intelligence/capability tech differentially benefits irresponsible & unwise actors: recklessly racing ahead increases your access to resources, whilst responsible & wise actors need time to figure out how to act wisely.
- Society struggles to adapt: government processes aren’t designed to handle a technology that moves as fast as AI, and reckless & unwise actors will use their political influence to push society to adopt unwise policies.
- Pursuing wisdom tech likely produces fewer capability externalities.
- A wisdom explosion might be achievable with AIs built on top of relatively weak base models: think of the wisest people you know; they don’t all have massive amounts of cognitive “firepower”.
- Unwise actors are less likely to value wisdom, especially given the trade-off with pursuing shiny, shiny capabilities.
- There is likely a minimum bar of wisdom required to trigger such an explosion. As they say, garbage in, garbage out.
- Even if they were able to trigger such an explosion, it’d likely take them longer and/or require a higher capability level. Remember, I’m proposing producing a cybernetic system, so the human operators play a key role here.
- This is less true at higher capability levels where the system can help them figure out what they should be asking, but they might just ignore it.
- Acquiring such technology may make them realise their foolishness. They may then either delete their model, hand it over to someone more responsible, or start working towards becoming a more responsible actor themselves.
- My intuition is that this is much harder for intelligence/capability tech, which will likely be superhuman at persuasion soon, but which is not a natural fit for non-manipulative persuasion[7].
- Before we begin: what level of wisdom would we need to spiral up to in order to count as having achieved a “wisdom explosion”? We might not need to set the bar all that high (insofar as superhuman systems go). Saving the world may require superhuman wisdom, but I don’t think it would have to be that superhuman.
- Wisdom seems like the kind of thing where having a greater degree of wisdom makes it easier to acquire even more. In particular, you are more likely to be able to discern who is providing wise or unwise advice, and you are more likely to be able to discern which assumptions require questioning. (A toy model of this “spiral up” dynamic is sketched below.)
- Insofar as we buy into the argument for an intelligence explosion being viable, one might naively assume that this also increases the chance that a wisdom explosion is viable:
- One could push back against this by noting that intelligence is much easier to train than wisdom because, for intelligence, we can train our system on problems with known solutions or with a simulator. This is true, but it doesn’t mean we can’t use these techniques for training wisdom; it just means we have to be more careful about how we go about it.
- It’s less about being wise and more about not being so ideological that you are unable to break out of an attractor.
- If this is true, then we may be able to trigger a wisdom explosion earlier than an intelligence explosion. This may also address some concerns about inner alignment, if we believe that smaller models tend to be more controllable[9].
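- As promised above, here’s a toy model of the “spiral up” dynamic. This is purely my own illustrative sketch, and the compounding constant $k$ is an assumption rather than anything the argument depends on: suppose the rate at which wisdom is acquired is proportional to the wisdom already held (via better discernment of advice and of which assumptions to question). Then $$\frac{dw}{dt} = k\,w \implies w(t) = w_0 e^{kt},$$ so any positive compounding rate ($k > 0$) yields exponential growth from the starting level $w_0$. If instead each increment of wisdom becomes harder to acquire, say $\frac{dw}{dt} = k/w$, then $w(t) = \sqrt{w_0^2 + 2kt}$: merely polynomial growth, and no explosion. In this framing, the viability of a wisdom explosion is a bet on which regime we’re in.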
- Even if the concept of a wisdom explosion turns out to be incoherent, or triggering a wisdom explosion turns out to be impossible, I still think that investigating and debating these topics would be a valuable use of time. I can’t fully explain this, but certain questions feel like obvious or natural questions to ask. Noticing these questions and following the line of inquiry until you reach a natural conclusion is one of the best ways of developing your ability to think clearly about confusing matters.
- The value of gaining a new frame isn’t just in the potential application of the frame itself, but in how it can reveal assumptions within your worldview that you may not even be aware of.