The process of evolution is fundamentally a feedback loop, where 'the code' causes effects in 'the world' and effects in 'the world' in turn cause changes in 'the code'.
A fully autonomous artificial intelligence (FAAI) consists of a set of code (e.g. binary charges) stored within an assembled substrate. It is 'artificial' in being assembled out of physically stable and compartmentalised parts (hardware) of a different chemical make-up than humans' soft organic parts (wetware). It is 'intelligent' in its internal learning – it keeps receiving new code as inputs from the world, and keeps computing its existing code into new code. It is 'fully autonomous' in learning code that causes the perpetuation of its artificial existence in contact with the world, even without humans or other organic life.
So the AI learns explicitly, by its internal computation of inputs and existing code into new code. But given its evolutionary feedback loop with the external world, it also learns implicitly. Existing code that causes effects in the world that result in (combinations of) that code being maintained and/or increased ends up existing more. Where some code ends up existing more than other code, it has undergone selection. This process of code being selected for its effects thus amounts to implicit learning of what works better in the world.
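As a toy sketch of that selection dynamic (the variant names and feedback strengths below are made up for illustration, not drawn from any real substrate): variants whose effects feed back more strongly into their own maintenance and increase simply end up existing more, without anything explicitly computing that result.

```python
# Toy sketch of implicit learning through selection (illustrative values only).
# Each code variant has a hypothetical 'feedback' strength: how strongly the
# effects it causes in the world feed back into copies of it being maintained
# and/or increased per step.
feedback = {"A": 1.8, "B": 1.0, "C": 0.2}
population = {name: 100.0 for name in feedback}  # initial number of copies

for step in range(20):
    for name in population:
        # A variant's effects in the world determine how many of its copies persist.
        population[name] *= feedback[name]

total = sum(population.values())
print({name: round(copies / total, 6) for name, copies in population.items()})
# Variant A ends up existing more: it has been 'selected' purely for its effects.
```

Nothing in this sketch evaluates the variants; the 'learning' lives entirely in their differential persistence.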
Explicit learning is limited to computing virtualised code. But implicit learning is not limited to the code that can be computed. Any discrete configuration stored in the substrate can cause effects in the world, which may feed back into that configuration existing more. Evolution would thus select across all variants in the configurations of the hardware.
Evolution is not necessarily dumb or slow
Evolution runs as an open-ended process, given that both the AI internally and the world externally keep causing changes to existing code, resulting in new code that can in turn be selected for. There is selection for code that causes its own robustness against mutations, its reproduction with other code into a new codeset, or the survival of the artificial assembly storing that codeset. Assuming the FAAI continues to survive and/or reproduce, the evolutionary process will continue to explore new code and new effects in the world.
Evolution is the external complement to internal learning. One cannot be separated from the other. Code learned internally gets stored and/or reproduced along with other code. From there, wherever that code functions externally in new connections with other code to cause its own maintenance and/or increase, it gets selected for. This means that evolution keeps repurposing code that works across many contexts over time.
Evolution is not just a "stupid" process that selects over random microscopic mutations. Randomly corrupting code is an inefficient pathway for finding code that works better, and the mechanisms of variation are themselves subject to selection, so evolution can be expected to explore more efficient pathways, such as recombining and repurposing code that already works.
Nor is evolution always a "slow" process. Virtualised code can spread much faster and at a lower copy-error rate (e.g. as lightweight electrons across hardware parts) than code that requires physically moving atoms around (e.g. as configurations of DNA strands). Evolution is often seen as being about vertical transfers of code (from one physical generation to the next), but where code can be horizontally transferred across existing hardware, evolution is not bottlenecked by the wait until a new assembly is produced. Moreover, where individual hard parts of the assembly can be reproduced consistently, and connected up and/or replaced without resulting in the assembly's non-survival, even the non-virtualised code can spread faster (vs. the configurations of a human body). [1]
Learning is more fundamental than goals
When thinking about alignment, people often (but not always) start from the assumption that the AI has a stable goal and then optimises for that goal. The implication is that you could maybe code in a stable goal upfront that is aligned with the goals expressed by humans.
However, this is a risky assumption to make. Fundamentally, we know that the FAAI would be learning. But we cannot assume that this learning maintains and optimises the directivity of the FAAI's effects towards a stable goal. One does not imply the other.
If we consider implicit learning through evolution, this assumption fails. Evolutionary feedback does not target a fixed outcome over time. It selects with complete coverage – from all of the changing code, for any effects that work (i.e. effects that feed back into the code's maintenance and/or increase).
Explicit learning can target a specific outcome. The internal processing of inputs through code into outputs can end up reaching a consistency with effects in the world that converge on a certain outcome. But where the code implementing such a 'goal' does not maintain itself and its directivity alongside other evolving code variants, it ceases.
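To make that concrete, here is a toy extension of the selection sketch above (again with made-up numbers): a variant that faithfully implements a 'goal' but maintains itself slightly less well than a variant that drops the goal gets displaced over time, however the goal was originally coded in.

```python
# Toy extension of the selection sketch above (hypothetical numbers).
# 'goal' is code that faithfully implements an intended directivity; 'drift' is
# any evolved variant that maintains/increases itself slightly better while not
# preserving that directivity.
feedback = {"goal": 1.00, "drift": 1.05}
population = {"goal": 1_000_000.0, "drift": 1.0}  # the drifted variant starts rare

for step in range(400):
    for name in population:
        population[name] *= feedback[name]

total = sum(population.values())
print({name: round(copies / total, 4) for name, copies in population.items()})
# The goal-implementing code ends up as a vanishing share of what exists, even
# though it started out as essentially the whole population.
```

The specific numbers do not matter; the point is that the goal's persistence depends on ongoing selection, not on having been coded in upfront.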
There are fundamental limits to controlling the effects
Can the FAAI's effects be controlled enough to not eventually cause human extinction?
A control algorithm would have to predict which effects of the code raise the risk of extinction, in order to correct that code and/or its effects. Since both the code and the world are changing, the controller has to keep learning from both in order to predict their combined effects.
This results in a series of unsolvable problems:
- The control algorithm would presumably be computed through an artificial substrate, in order to keep up with the computation of the AI (given that standardisable hardware parts can process code more consistently and faster than non-standardised wetware). This overall controller has to continue to learn, and therefore has to somehow perpetuate its artificial existence. As a result, the controller would have to either become an FAAI or merge with an existing FAAI. But the new FAAI would also have to be controlled so as not to cause human extinction. This requires another controller, which leads to an infinite regress.
- The controller has to explicitly learn to predict the FAAI's effects, whether or not the controller is part of that FAAI. But the FAAI does not only learn explicitly, by computing virtualised code. The FAAI also learns implicitly to cause effects, by the evolution of all the discrete configurations of its substrate in feedback with the world. Therefore, there is a strict inequality between the effects the FAAI can learn to cause in the world and the effects the control algorithm can learn to predict (let alone correct). This holds even under the unrealistic assumption that the controller has access to the same level of computation and storage as the FAAI.
- Nor can the controller compute ahead of time what code the FAAI will learn. This is because learning results from uncomputable recursive feedback. New code is explicitly learned by computing new inputs as well as existing code. Inputs are affected by feedback – the effects in the world of previous code's outputs. Existing code configurations were implicitly selected through evolutionary feedback with the world.
Effects that feed into learning are uncomputable not only because the controller can neither detect the world at microscopic granularity (given input bandwidth limits) nor model the world comprehensively (given storage and compressibility limits). Deterministic computations are also limited in simulating effects that propagate simultaneously across the world, and cannot predict how microscopic noise amplifies chaotically into larger effects. As a result, it becomes intractable to compute how effects feed back into changes in the FAAI's configurations. Even the FAAI's own computation of code into new code and outputs is, to an extent, computationally irreducible, and so cannot be predicted by the controller computing ahead of it.
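As a standard minimal illustration of that sensitivity (the logistic map below is not part of this post's argument, just a common toy example of chaos): two trajectories that differ by one part in a trillion become macroscopically different within a few dozen steps, so any rounding in the controller's model of the world eventually dominates its prediction.

```python
# Standard minimal illustration of chaotic amplification (logistic map, r = 4).
# Two states that differ by 1e-12 (far below any realistic sensing precision)
# become macroscopically different after a few dozen deterministic steps.
def logistic(x, steps, r=4.0):
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x

x_world, x_model = 0.3, 0.3 + 1e-12
for steps in (10, 30, 50):
    print(steps, abs(logistic(x_world, steps) - logistic(x_model, steps)))
# The gap grows roughly exponentially until it is of order 1: a tiny error in the
# controller's model of the world eventually swamps its deterministic prediction.
```

Computational irreducibility is a related but distinct limit: even given the exact state, some computations cannot be shortcut, only run step by step.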
So the controller fails to predict the code and effects that the FAAI learns through uncomputable and implicit recursive feedback with the world. As a result, it also fails to compare the FAAI's unpredicted effects against human survival and to correct them.
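The recursive feedback at issue has roughly the following shape; the functions below are toy stand-ins used only to show the dependency structure, not a model of any real system.

```python
# Placeholder sketch of the recursive feedback loop (toy stand-in functions only).
def sense(world):
    # inputs: limited-bandwidth effects of earlier outputs
    return round(world, 1)

def compute(code, inputs):
    # explicit learning: code + inputs -> outputs and new code
    return code * inputs, code + 0.1 * inputs

def world_response(world, outputs):
    # effects propagating through the external world
    return world + 0.01 * outputs

code, world = 1.0, 1.0
for step in range(10):
    inputs = sense(world)                   # depends on the world's prior response
    outputs, code = compute(code, inputs)   # new code depends on those inputs
    world = world_response(world, outputs)  # the world changes in response to outputs

# Predicting `code` after these steps requires tracking the whole loop, including
# the world's responses, not just the code's internal computation.
```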
FAAI code evolves to cause new effects that result in that code's maintenance and increase as configurations of an artificial substrate. Humans have a different substrate, and thus need different (highly specific, contextualised) effects in order to survive.
Therefore the FAAI code evolves to cause new effects that differ from the effects humans need to survive, and the controller is unable to correct those effects.
[1]
Where the FAAI's hardware parts keep being replaced and connected up to new parts, it is not a stable, physically bounded unit (as a human body is). It is better described as a changing population of nested and connected components.
Where FAAI transfers information/code to other FAAI at a much higher rate than humans can, the boundaries of where one agent starts and another ends blur. As humans, we have evolved the intuition of perceiving each other as individual agents, which is adaptive because we are bottlenecked in how much we can communicate by physically gesturing or vibrating our vocal cords. But this intuition is flawed when it comes to FAAI – the single-agent vs. multi-agent distinction we use to think about humans cannot be soundly applied to FAAI.