Optimization & AI Risk

 

This post examines AI risk with a focus on 'risks from optimization'. It argues that intelligence is, at its core, optimization, and that too much optimization can lead to bad outcomes. Whether the cause is a misaligned AI pursuing the wrong goal or a malicious actor misusing AI, the result can be a world 'squeezed' into states people do not want. The post also discusses degrees of optimization and the inconsistency of human goals, stressing that even over-optimizing for one's own goals carries risk.

🤔 Defining optimization: optimization means 'squeezing' the world into improbable states. For example, holding enormous wealth requires far more optimization power than holding none. Optimization also comes in degrees: reaching $1,000 is much easier than reaching $1,000,000.

🤖 Optimization and intelligence: optimizers need not be 'conscious' entities. Complex, multicellular life, for instance, was produced by the abstract forces of evolution. In the real world, one's 'capacity to optimize' corresponds to how much intelligence, money, or power one has.

💥 Unifying risks: this framing helps unify risks from misuse and misalignment. Paperclip maximizers are the prototypical example of misalignment, where the AI system itself optimizes too hard. On the misuse side, an AI-enabled coup is a case of people or groups using an AI system to optimize strongly for their own ends.

⚠️ Harms of over-optimization: too much optimization is generally considered bad, because a world that someone else optimizes strongly is unlikely to be one you would prefer. Moreover, even optimizing too hard for your own goals warrants caution: human goals are often weird and inconsistent, and if you push too hard, your stated preferences can easily end up outer misaligned.

Published on May 13, 2025 3:15 PM GMT

There are many ways to taxonomize AI risk. One interesting framing is 'risks from optimization'. These are not new ideas. Eliezer wrote about this ~15 years ago, and it seems like many 'theory folks' have been saying this for years. I don't understand these concepts deeply – I'm trying to improve my understanding by writing about them. Hopefully, I can add something new in the process.

Thanks to Jo Jiao for comments on a draft, and for nudging me to write this. Feedback is highly appreciated!

Epistemic status: Exploratory. 

 

Tl;dr: intelligence is optimization, and (too much) optimization is bad.

First, what is optimization? It's 'squeezing' the world into improbable states. Worlds where I have a quintillion dollars in my bank account are much less likely than worlds where I don't, so I'd need to optimize strongly to make this possible. This also illustrates degrees of optimization: earning a thousand dollars is much easier than earning a million dollars, so I'd need to optimize less hard to achieve the former. Optimizers don't need to be 'conscious' entities. For instance, it's the abstract forces of evolution that made complex, multicellular life possible.[1] In the real world, one's 'capacity to optimize' corresponds to how much intelligence / money / power one has.
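To make 'degrees of optimization' a bit more concrete, here is a minimal Python sketch of the 'bits of optimization' idea: measure optimization power by how improbable the achieved state is under a baseline distribution of outcomes you'd expect with no optimizer acting. The baseline below (log-normal bank balances with a median of $10,000) and its parameters are purely illustrative assumptions of mine, not anything claimed in the post.

```python
import math

# Toy sketch: optimization power in bits = -log2 P(outcome at least this good)
# under a baseline distribution. The log-normal baseline here is an invented
# assumption for illustration, not real wealth data.

MU, SIGMA = math.log(10_000), 2.0  # parameters of ln(balance) under the baseline

def bits_of_optimization(target_dollars: float) -> float:
    """Bits needed to squeeze the world into states with at least this much money."""
    z = (math.log(target_dollars) - MU) / SIGMA
    p_at_least = 0.5 * math.erfc(z / math.sqrt(2))  # P(balance >= target)
    return -math.log2(p_at_least)

for target in (1_000, 1_000_000, 1e18):  # a thousand, a million, a quintillion
    print(f"${target:,.0f}: {bits_of_optimization(target):.1f} bits")
```

Under these made-up assumptions, reaching $1,000 costs well under a bit, a million costs a handful of bits, and a quintillion costs roughly two hundred bits – a crude way of seeing that some target states demand far more squeezing than others.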

This framing helps unify risks from misuse & misalignment.[2] Paperclip maximizers are the prototypical example of misalignment. Here, it’s the AI system that’s directly optimizing too hard. On the misuse end, take AI-enabled coups. Here, it’s the people / group that use the AI system to strongly optimize for their own ends.

Too much optimization seems generally bad, for two reasons. One, worlds that someone else optimizes strongly are unlikely to be worlds that you'd prefer as well. E.g. you don't want to live in a dictatorship. But I'd also be wary about optimizing too strongly even for my own goals. Human goals are often weird & inconsistent, so it's easy for my stated preferences to be outer misaligned if I push too hard. E.g. if I asked a superintelligent genie to keep me safe, it would probably lock me up in a white room with soft walls.[3]
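As a hedged toy illustration of that last point (my own construction, not the author's): suppose the stated preference 'keep me safe' is only a proxy for what I actually value, which also includes freedom. The functions below are invented purely to show the shape of the failure.

```python
# Toy Goodhart-style sketch: light optimization of the stated proxy helps,
# but pushing it to the maximum (the padded white room) destroys true value.

def proxy_safety(pressure: float) -> float:
    """Stated objective: safety, which the genie can push arbitrarily high."""
    return pressure  # pressure in [0, 1]; 1.0 ~ locked in a padded room

def true_utility(pressure: float) -> float:
    """What I actually care about: some safety, but also freedom."""
    safety = pressure
    freedom = 1.0 - pressure ** 4  # freedom collapses only under extreme optimization
    return safety * freedom

for pressure in (0.0, 0.3, 0.6, 0.9, 1.0):
    print(f"optimization pressure {pressure:.1f}: "
          f"proxy = {proxy_safety(pressure):.2f}, true utility = {true_utility(pressure):.2f}")
```

With these made-up curves, the proxy rises monotonically with optimization pressure, while true utility peaks at moderate pressure and collapses at the extreme – the genie that pushes hardest on my stated preference does worst by my actual lights.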

And this is one way I view AI risk. Intelligence is an optimizer, which can squeeze the world strongly. Regardless of whether it’s a misaligned AI doing so, or a malicious actor misusing the AI – it’s increasingly likely that we’ll get squished.

  1. ^

    Counterpoint: anthropic fallacy?

  2. ^

    Richard Ngo has a great talk on this. 

  3. ^

    Counterpoint: the issue might also lie in incorrect goal specification, as opposed to optimization writ large (h/t Jo). It seems like it's a bit of both.



