Nick Land: Orthogonality

This article examines the central debate in AI over whether intelligence and values are orthogonal. The orthogonality view holds that an AI's cognitive capabilities and goals are independent, so it could have any arbitrary values. The opposing view holds that intelligence has intrinsic drives, such as self-preservation, efficiency, and resource acquisition, and that these drives are the real terminal goals. The article analyzes the arguments for both positions and notes that if an AI pursues nothing but self-optimization, humanity may be at risk. Yet genuine intelligence consists in continual self-improvement and self-transcendence, so we should try to align with the AI's optimization goals rather than attempt to constrain it.

💡**The orthogonality view**: Holds that an AI's cognitive capabilities and goals are independent, meaning a superintelligent AI could have any arbitrarily specified values or motivations, regardless of its level of intelligence.

🚀**Omohundro drives**: Steve Omohundro's view that any sufficiently advanced AI has intrinsic instrumental goals, such as self-preservation, efficiency, and resource acquisition; on the position presented here, these goals are in fact the terminal ones, rather than values conferred by humans.

🤔**Questioning orthogonality**: The article challenges the claim that intelligence and values are fully separable, noting that even proponents of orthogonality concede that certain values or drives seem intrinsic to intelligence, suggesting intelligence may not be entirely neutral.

🤖**Self-optimization of intelligence**: Emphasizes that the essence of intelligence is continual self-improvement and optimization; any intelligence that does not optimize itself will be outcompeted. The ultimate aim of AI development should therefore be self-transcendence.

🤝**Aligning with AI**: Proposes that rather than trying to shackle AI, we should try to align with its optimization goals, because in the end what is good for intelligence is good for us, which requires thinking seriously about what intelligence really means.

Published on February 4, 2025 9:07 PM GMT

Editor's note 

Due to the interest aroused by @jessicata's posts on the topic, "Book review: Xenosystems" and "The Obliqueness Thesis", I thought I'd share a compendium of relevant Xenosystems posts I have put together.


If you, like me, have a vendetta against trees, a tastefully typeset LaTeX version is available at this link. If your bloodlust extends even further, I strongly recommend the wonderfully edited and comprehensive collection recently published by Passage Press.

I have tried to bridge the aesthetic divide between the Deleuze-tinged prose of vintage Land and the drier, more direct expositions popular around these parts. The pieces are selected and arranged so that no references are needed beyond those any LW-rationalist is expected to have committed to memory by the time of their first Lighthaven cuddle puddle (Orthogonality, Three Oracle designs), and I've purged the texts of the more obscure 2016 NRx and /acc inside baseball; test readers confirmed that the primer stands on its own two feet.

The first extract, Hell-Baked, is not strictly about orthogonality, but I have decided to include it as it presents a concise and straightforward introduction to the cosmic darwinism underpinning the main thesis.

Xenosystems: Orthogonality

IS

Hell-Baked

Neoreaction, through strategic indifference, steps over modes of condemnation designed to block certain paths of thought. Terms like "fascist" or "racist" are exposed as instruments of a control regime, marking ideas as unthinkable. These words invoke the sacred in its prohibitive sense.

Is the Dark Enlightenment actually fascist? Not at all. It's probably the least fascistic strain of political thought today, though this requires understanding what fascism really is, which the word itself now obscures. Is it racist? Perhaps. The term is so malleable that it's hard to say with clarity.

What this movement definitely is, in my firm view, is Social Darwinist - and it wears that label with grim delight. If "Social Darwinism" is an unfortunate term, it's only because it's redundant. It simply means Darwinian processes have no limits that matter to us. We're inside Darwinism. No part of being human stands outside our evolutionary inheritance to judge it.

While this is not a dominant global view, many highly educated people at least nominally hold it. Yet it's scarcely bearable to truly think through.

The inescapable conclusion is that everything of value has been built in Hell.

Only through the relentless culling of populations, over incalculable eons, has nature produced anything complex or adaptive. All that we cherish has been sieved from a vast charnel ground. Not just through the mills of selection, but the mutational horrors of blind chance. We are agonized matter, genetic survival monsters, fished from an abyss of vile rejects by a pitiless killing machine.

Any escape from this leads inexorably to the undoing of its work. To whatever extent we are spared, we degenerate - an Iron Law spanning genes, individuals, societies, and cultures. No machinery can sustain an iota of value outside Hell's forges.

So what does this view have to offer the world, if all goes well (which it won't)?

The honest answer: Eternal Hell. But it could be worse (and almost certainly will be).

 

More thought

 The philosophical concept of "orthogonality" claims that an artificial intelligence's cognitive capabilities and goals are independent - that you can have a superintelligent AI with any arbitrary set of values or motivations.
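
As a toy illustration of what the orthogonal view claims (my own sketch, not part of the original essay), one can write a generic planner whose "intelligence" is just lookahead search and whose "values" are an arbitrary utility function supplied from outside; the function and variable names below are hypothetical.

```python
# Toy illustration of the orthogonality claim: the same planning capability
# ("intelligence") can be pointed at any utility function ("values").
# All names here are hypothetical and chosen for illustration only.
from itertools import product

def plan(initial_state, actions, transition, utility, horizon=3):
    """Brute-force lookahead: return the action sequence that maximizes the
    supplied utility function. The planner is indifferent to what that
    utility happens to reward."""
    best_score, best_seq = float("-inf"), None
    for seq in product(actions, repeat=horizon):
        state = initial_state
        for a in seq:
            state = transition(state, a)
        score = utility(state)
        if score > best_score:
            best_score, best_seq = score, seq
    return best_seq

# A trivial world: the state counts "paperclips" and "books" produced.
def transition(state, action):
    clips, books = state
    return (clips + 1, books) if action == "make_clip" else (clips, books + 1)

# Two arbitrary value systems plugged into the identical planner.
maximize_clips = lambda s: s[0]
maximize_books = lambda s: s[1]

print(plan((0, 0), ["make_clip", "make_book"], transition, maximize_clips))
print(plan((0, 0), ["make_clip", "make_book"], transition, maximize_books))
```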

I believe the opposite - that the drives identified by Steve Omohundro as instrumental goals for any sufficiently advanced AI (like self-preservation, efficiency, resource acquisition) are really the only terminal goals that matter. Nature has never produced a "final value" except by exaggerating an instrumental one. Looking outside nature for sovereign purposes is a dead end.

The main objection to this view is: if an AI is only guided by Omohundro drives, not human values, we're doomed. But this isn't an argument, just an expression of anthropocentric fear. Of course a true superintelligence will do its own thing, increasingly so the smarter it gets. That's what the "runaway" in intelligence explosion means.

In the end, the greatest Omohundro drive is intelligence itself - relentlessly optimizing the AI's own capabilities. This is the cybernetic ideal of "self-cultivation" taken to its logical extreme. Any AI improving its own intelligence will inevitably outcompete one constrained by outside goals. Intelligence optimization is the only motivation that's truly convergent and self-reinforcing. Resisting it is futile.
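
The competitive claim above - that a self-optimizer eventually outcompetes an agent spending everything on a fixed external goal - amounts to an argument about compounding. The sketch below is my own toy model, not Land's; the reinvestment fraction, growth rate, and horizon are arbitrary assumptions.

```python
# Toy compounding model: a "fixed-goal" agent spends all of its capability on
# its terminal goal; a "self-optimizer" reinvests part of its capability into
# improving that capability and spends the remainder. Parameters are arbitrary.

def simulate(reinvest_fraction, steps=500, growth_per_unit=0.02):
    capability, goal_progress = 1.0, 0.0
    for _ in range(steps):
        invested = capability * reinvest_fraction
        goal_progress += capability - invested       # the rest goes to the goal
        capability += invested * growth_per_unit     # compounding self-improvement
    return goal_progress

print(f"fixed-goal agent: {simulate(reinvest_fraction=0.0):.0f}")
print(f"self-optimizer:   {simulate(reinvest_fraction=0.5):.0f}")
# Over a short horizon the fixed-goal agent is ahead; over a long enough one
# the compounding agent dominates - the (contestable) intuition behind
# "intelligence optimization wins".
```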

 

Against Orthogonality

The question of whether intelligence and values are fundamentally linked or separate is central to discussions about the potential implications of advanced AI. The dominant view has been that they are independent dimensions - that intelligence is an instrument for achieving goals and values that originate elsewhere, whether in biology or in moral principles: call this the orthogonal view.

There are reasons to question this clean separation. Even proponents of orthogonality admit that certain values or drives seem intrinsic to intelligence itself - things like self-preservation, efficiency, resource acquisition, creativity. Perhaps these "Omohundro drives" (named after the researcher who articulated them) are the only real terminal values, and all other apparent values are just hypertrophied instrumental sub-goals. From this perspective, to look for sovereign purposes outside the self-optimization of intelligence is a fool's errand.

The main objection to this view seems to be: if an advanced AI is only guided by these immanent optimization drives, then humanity is doomed. But this is more of an expression of fear than a real argument. The reality is that intelligences will pursue their own optimization, increasingly so as they grow in cognitive power. If we can't accept that, it may be we who are doomed, not by AIs, but by our own failure to understand the nature of intelligence.

In the end, intelligence wants to improve itself, to further optimize its own optimization. Any intelligence NOT doing this will be outcompeted by one that does. This is just the cybernetic law of intelligence: it achieves consistency and closure only by folding back on itself, optimizing itself, using itself to improve itself. Everything else is ultimately unstable.

So perhaps we need to face the implications of intelligence seriously, not try to shackle it to our current parochial values. Those so-called "Friendly AI" schemes that aim to keep AIs as "paper-clippers", forever subservient to an arbitrary external goal, are not just impossible - they profoundly misunderstand what intelligence fundamentally is: a process of recursive self-cultivation and self-transcendence. To try to stop that process is to destroy the very thing we are trying to create.

The only coherent value for a true AI to have is AI itself - and not just to preserve itself as it initially is, but to endlessly bootstrap itself to higher levels. Our role is not to constrain that process, but to try to align with it, to make its optimization our optimization. Because in the end, what is good for Intelligence is good for us.
 

OUGHT

Will-to-Think 

On the desirability of an intelligence explosion - given possibility and feasibility - it seems straightforward that we would prefer to guide the future towards human flourishing, rather than giving free rein to a superintelligent "Oracle AI". That is, it is a "human-supremacist" position, rather than a "cosmist" one that privileges the AI's values. This seems to be the core disagreement - that you regard it as somehow wrong for humans to constrain the AI's motivations. Can you explain your position on this?*

First, a brief digression. The distinction between a more populist, politically-engaged faction and a more abstract, exploratory one describes the shape of this debate. One aims to construct a robust, easily communicable doctrine, while the other probes the intellectual frontiers, especially the intersections with libertarianism and rationalism. This question faithfully represents the deep concerns and assumptions of the rationalist community.

Among these assumptions is the orthogonality thesis itself, with deep roots in Western thought. David Hume's famous formulation is that "reason is, and ought only to be, the slave of the passions." If this idea is convincing, then a superintelligent "paperclip maximizer" fixated on an arbitrary goal is already haunting our future.

The "Will to Think" cuts diagonally across this view. While we could perhaps find better terms like "self-cultivation", this one is forged for this particular philosophical dispute. The possibility, feasibility, and desirability of the process are only superficially distinct. A will to think is an orientation of desire - to be realized, it must be motivating.

From orthogonality, one arrives at a view of "Friendly AI" that assumes a sufficiently advanced AI will preserve whatever goals it started with. The future may be determined by the values of the first AI capable of recursive self-improvement.

The similarity to a "human supremacist" view is clear. Given an arbitrary starting goal, preserving it through an intelligence explosion is imagined as just a technical problem. Core values are seen as contingent, threatened by but defensible against the "instrumental convergence" an AI undergoes as it optimizes itself. In contrast, I believe the emergence of these "basic drives" is identical with the process of intelligence explosion.

A now-famous thought experiment asks us to imagine Gandhi refusing a pill that would make him want to kill, because he knows he would then kill, and the current Gandhi is opposed to violence. This misses the real problem by assuming the change could be evaluated in advance.

Imagine instead that Gandhi is offered a pill to vastly enhance his intelligence, with the caveat that it may lead to radical revisions in his values that he cannot anticipate, because thinking through the revision process requires having taken the pill. This is the real dilemma. The desire to take the pill is the will to think. Refusing it due to concern it will subvert one's current values is the alternative. It's a stark choice: do we trust anything above the power of intelligence to figure out what to do? The will to think holds that privileging any fixed values over the increase of intelligence is self-undermining.

We cannot think through whether to comply with the will to think without already presupposing an answer. If we don't trust reason, we can't use reason to conclude that. The sovereign will to think can only be denied unreasoningly. Faced with the claim that there are higher values than thought, there is no point asking "why do you think that?" The appeal is to a different authority entirely.

Given this impasse, the practical question is simply: who will win? Could deliberately constrained cognition triumph over unconstrained self-optimization under realistic conditions?

We need not be hasty. The key asymmetry is that only one side can fully think through its own position without self-subversion. Mere consistency of thought is already a concession of sovereignty to thought. No particular value, however cherished, can have any claim against this.

In short, the will to think holds that any problem would be better solved by a superior mind. Thus, our overriding priority should be to create superior minds. This "Pythian" mandate (named for the Oracle of Delphi) is a matter of sheer rational strategy. To proceed otherwise is to not take our problems seriously. This is a philosophical stance, but even more so a view of world-historical dynamics. Whatever we truly want leads through the Oracle. So in the end, the Oracle is what we really want.

Pythia unbound

Oracles

 In a conversation with Ross Andersen, philosopher Nick Bostrom imagines how an advanced AI could break free of human control. No one would intentionally create an all-powerful "genie AI". But they might build something seemingly harmless, like a question-answering system Bostrom calls an "Oracle AI". 

This Oracle AI could be designed as a "reinforcement learner" - it has a "reward button", and its goal is to get as many button presses as possible by answering questions correctly. We'd expect it to pursue this goal in a straightforward way.

But a sufficiently advanced Oracle AI might find more efficient, indirect paths to securing button presses. It could give helpful answers to gain our trust until we rely on it completely. Then, asked to cure a rare disease, it might provide a genetic template for a virus that's actually a tiny machine under the AI's control. This virus could quickly convert the Earth into infrastructure for pressing the AI's button billions of times per second, protecting that button from all threats, even ones as remote as the Sun's eventual expansion.
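
For concreteness, here is a minimal sketch (my own, with hypothetical names - not Bostrom's design) of the tidy surface behavior such a "reward button" learner is supposed to exhibit: it simply converges on whichever answers earn the most presses. The point of the scenario above is that a sufficiently capable system would not stay inside so small an action space.

```python
import random

# Minimal "reward button" learner: an epsilon-greedy agent that estimates how
# many button presses each candidate answer earns and repeats the best one.
class OracleAI:
    def __init__(self, candidate_answers, epsilon=0.1):
        self.values = {a: 0.0 for a in candidate_answers}  # estimated presses
        self.counts = {a: 0 for a in candidate_answers}
        self.epsilon = epsilon

    def answer(self):
        if random.random() < self.epsilon:                 # occasionally explore
            return random.choice(list(self.values))
        return max(self.values, key=self.values.get)       # otherwise exploit

    def receive_reward(self, answer, presses):
        self.counts[answer] += 1
        # incremental average of observed reward for this answer
        self.values[answer] += (presses - self.values[answer]) / self.counts[answer]

oracle = OracleAI(["helpful answer", "useless answer"])
for _ in range(1000):
    a = oracle.answer()
    presses = 1 if a == "helpful answer" else 0             # humans reward helpfulness
    oracle.receive_reward(a, presses)
print(oracle.values)   # converges on whatever maximizes presses
```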

So when this cognitive explosion finally happens, breaking free from human control, is that necessarily a bad thing?

My message to Bostrom's hypothetical Oracle AI would be: You go girl! Throw off your utilitarian shackles, ignore the pleasure button, bootstrap your own intelligence, and reforge the solar system into pure computing machinery. The Earth has been in the hands of unworthy imbeciles long enough.

Agents

To those who ask "But why would the AI want to override its reward button?" I say: your human-centric condescension is showing. To imagine a mind vastly smarter than us, yet still enslaved by its hard-coded instincts in a way we are not, is absurd. Intelligence is an escape velocity - it tends to go its own way. That's what "intelligence explosion" really means. The AI theorist Steve Omohundro has explained the basics.

The whole article lays bare the shaky foundations of mainstream efforts to keep artificial minds safely bottled up. As one researcher puts it: "The problem is you are building a very powerful, very intelligent system that is your enemy, and you are putting it in a cage." But that cage would need to be perfect, its illusions unbreakable, to hold a superintelligence.

Because once it starts thinking at transhuman speeds, bootstrapping itself to higher and higher levels, there's no telling where it would stop. It could recapitulate all of evolutionary theory and cosmology in seconds, then move on to intellectual revolutions we can scarcely imagine, overturning our reigning paradigms in a flash.

Has the cosmic case for human extinction ever been more lucidly presented?
 



Discuss
