Low P(x-risk) as the Bailey for Low P(doom)

The article examines the definition of existential risk (x-risk) and the ambiguity that surrounds it in practice. The author notes that Nick Bostrom's definition covers adverse outcomes that would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential. Many scenarios treated as "non-doom", such as humanity being confined to the Solar System rather than reaching the wider universe, nevertheless fall under the definition of x-risk, which leads people to understate the relevant probabilities. The article stresses that a low probability of "doom" (P(doom)) does not imply a low P(x-risk), since the latter can cover a broad range of outcomes in which humanity's potential is drastically curtailed by AI. The author recommends avoiding the ambiguous word "doom" and instead clearly stating one's position on "permanent disempowerment" in order to communicate concerns about future risks more accurately.

💡 **The broad definition of existential risk invites misreading**: Nick Bostrom defines existential risk (x-risk) as an adverse outcome that would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential. Many scenarios that stop short of outright extinction but drastically curtail humanity's potential therefore still count as x-risk, so discussing only the probability of "doom" (P(doom)) may fail to reflect the true extent of the risk. For example, being confined to the Solar System and losing the potential to reach the wider universe falls under x-risk even though it is not extinction.

🚀 **A low probability of "doom" does not mean a low existential risk**: The article points out that when people discuss a low probability of "doom" (P(doom)), they usually have only the most extreme downside outcomes in mind. A P(doom) of 20% does not imply that the remaining 80% is free of outcomes in which humanity's potential is permanently and drastically curtailed. Even with a low P(doom), the probability of x-risk (i.e., of humanity's potential being drastically curtailed) could be as high as 90-98%. In some cases, publicly claiming a low P(doom) may therefore serve as a strategy to obscure an implicit expectation of high x-risk, i.e., of severe limits on humanity's future potential.

⚖️ **The importance of distinguishing "extinction" from "permanent disempowerment"**: The article distinguishes the terms "extinction", "doom", and "x-risk", and introduces "permanent disempowerment" to describe the many negative outcomes that lie between extinction and the outer boundary of x-risk. It also notes outcomes in which humans are constrained or changed in cruel or inhumane ways, or denied autonomy over their own development; together with extinction, these form a natural set of extremely bad outcomes for which there is no single word. Clearly stating one's position on "permanent disempowerment" therefore conveys the core point more accurately than talking only about extinction or x-risk.

🗣️ **Use clearer terms and avoid "doom"**: Given the ambiguity of the word "doom", the author recommends avoiding it. Clearly stating one's stance on "permanent disempowerment" is a straightforward way to avoid confusion. The article stresses that reporting a somewhat low extinction risk or a somewhat high x-risk without addressing permanent disempowerment leaves a great deal of room for interpretation. By contrast, a very high extinction risk or a very low x-risk is relatively unambiguous, because it leaves little room for permanent disempowerment to hide in.

Published on July 29, 2025 6:01 PM GMT

Nick Bostrom defines existential risk as

Existential risk – One where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential.

The problem with talking about "doom" is that many worlds that fall to existential risk but don't involve literal extinction are treated as non-doom worlds. For example, leaving humanity the Solar System rather than a significant portion of the 4 billion galaxies reachable from the Solar System is plausibly a "non-doom" outcome, but it's solidly within Bostrom's definition of x-risk.

Thus when people are discussing P(doom), the intent is often to discuss only extreme downside outcomes, and so a low P(doom) such as 20% doesn't imply that the remaining 80% avoids a permanently and drastically curtailed future for humanity. In other words, a P(doom) of 20% is perfectly compatible with a P(x-risk) of 90-98%.
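A minimal worked example of how the two numbers can coexist, with purely illustrative probabilities and assuming "doom" is read narrowly as only the extreme downside outcomes, disjoint from the milder permanently curtailed futures:

\[
P(\text{x-risk}) \;=\; \underbrace{P(\text{doom})}_{0.20} \;+\; \underbrace{P(\text{permanently curtailed, but ``non-doom''})}_{0.75} \;=\; 0.95
\]

Only the narrow 20% gets reported as P(doom), even though 95% of the probability mass falls under Bostrom's definition of x-risk.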

Ben Mann of Anthropic said in a recent interview (at 50:24 in the podcast) that his x-risk probability is 0-10%. If he really did mean Bostrom's x-risk rather than extinction risk (which "x-risk" isn't an abbreviation for), that seems like a relatively unusual claim.

Low P(doom) seems like a much more defensible position than low x-risk, and so low P(doom) might often be taking the role of motte to the bailey of low x-risk (I'm not sure how low x-risk could possibly be made defensible, if x-risk is taken in Bostrom's sense). Or one publicly claims low P(doom), but then implicitly expects high x-risk, meaning a high probability of drastically curtailed potential, of the cosmic endowment being almost completely appropriated by AIs.

Between Extinction and Permanent Disempowerment

The three terms in recent use for the extreme downside of superintelligence are extinction, doom, and x-risk. Permanent disempowerment (some level of notably reduced potential, but not extinction) covers a lot of the outcomes that fall within x-risk but short of extinction. Doom is annoyingly ambiguous: it might include only extreme levels of permanent disempowerment, or it might extend to even very slight permanent disempowerment, in the sense that some nontrivial portion of the cosmic endowment goes to AIs that are not a good part of humanity's future.

Also there are worlds within x-risk that involve neither extinction nor permanent disempowerment, where originally-humans are constrained or changed in cruel and unusual ways, or not given sufficient autonomy over their own development. These outcomes together with extinction form a natural collection of extremely bad outcomes, but there is no word for them. In the framing of this post, they are the outcomes that are worse than mere permanent disempowerment. "Doom" would have worked to describe these outcomes if it weren't so ambiguous and didn't occasionally include moderate permanent disempowerment within itself. That ambiguity matters in particular for people who want to publicly claim a very high P(doom): there, "doom" becomes the motte of their position, obscuring the implicit expectation that extinction itself has only a relatively moderate probability, such as 50%.

Avoid "Doom", Clarify Stance on Permanent Disempowerment

Meaningful use of "doom" seems hopeless. But clarifying your stance on permanent disempowerment seems like a straightforward recipe for disambiguating intended meaning. It also doesn't risk obscuring something central to your position as much as talking only about extinction or x-risk does, because even though those terms are well-defined, there could be a lot of permanent disempowerment outside of extinction yet inside x-risk, or alternatively only a little bit.

If the permanent disempowerment outcomes are few, extinction gets much closer to x-risk, and doom becomes less ambiguous; but that is only visible if you name both the extinction risk and the x-risk, and it remains extremely ambiguous if you state only a somewhat low extinction risk or only a somewhat high x-risk. On the other hand, claims of very high extinction risk or very low x-risk are unambiguous, because they leave little space for permanent disempowerment to hide in.
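To spell out the underlying arithmetic (assuming extinction and permanent disempowerment are disjoint classes of outcomes that both fall under x-risk):

\[
P(\text{permanent disempowerment}) \;\le\; P(\text{x-risk}) - P(\text{extinction})
\]

Stating, say, P(extinction) = 0.95 or P(x-risk) = 0.02 forces this gap to be small, whereas stating only P(extinction) = 0.2, or only P(x-risk) = 0.8, leaves the gap, and hence the stance on permanent disempowerment, almost unconstrained.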



