Consider the humble rock (or: why the dumb thing kills you)

The author argues that when people think about AI safety risks, they tend to over-focus on the threat posed by superintelligence and overlook the risks created by "dumb" failures. Using people's outsized worry about knife attacks and their miscalibrated intuitions about poisons as real-world examples, the author shows that the "minimum viable" threat is often more realistic than the one that "looks scariest." In the AI domain, the author argues, we should likewise pay attention to "dumb" paths to human extinction, such as bioweapon leaks or accidental nuclear detonations, because they may be far more likely to occur than the emergence of a superintelligence.

🤔 The author argues that people thinking about AI safety risk tend to over-focus on superintelligence and overlook the risks posed by "dumb" failures, using over-preparation for knife attacks and miscalibrated beliefs about poisons to show that the "minimum viable" threat is often more realistic than the "scariest-looking" one.

💥 In the AI domain, the author argues, we should also pay attention to "dumb" failure modes that could lead to human extinction, such as bioweapon leaks or accidental nuclear detonations, because they may be far more likely than the emergence of a superintelligence. Such "dumb" failures may come about more easily because they require no superintelligent "wisdom," only a few "accidental" factors.

💡 The author also borrows the concept of the "Great Filter": we do not see many alien civilizations, and we also do not see many superintelligences, possibly because they were destroyed by "dumb" failures. If AI is the Great Filter, then we should pay attention to failure modes in which AI gets out of control without ever being superintelligent.

🤔 The author further points out that although "dumb" failures may carry the larger risk, people tend to focus on the threat of superintelligence because it is "scarier." This may be because people's thinking favors things with strong "narrative" appeal while neglecting the mundane but more probable risks.

🤔 The author also raises a counterargument: even if "dumb" failures carry more total risk, people in AI safety positions have far more leverage over the pathways that lead to competent maximizers than over the pathways that lead to careless actors with doomsday technology, so a comparative-advantage argument can justify focusing on the scarier risk. The author asks readers to check whether that argument, rather than the scariness itself, is the real source of their priorities.

Published on July 4, 2024 1:54 PM GMT

When people think about street-fights and what they should do when they find themselves in the unfortunate position of being in one, they tend to stumble across a pretty concerning thought relatively early on: "What if my attacker has a knife?" Then they will put loads of cognitive effort into strategies for how to deal with attackers wielding blades. At first glance this makes sense. Knives aren't that uncommon and they are very scary, so it feels pretty dignified to have prepared for such scenarios (I apologize if this anecdote is horribly unrelatable to Statesians). The issue is that, all in all, knife-related injuries from brawls or random attacks aren't that common in most settings. Weapons of opportunity (a rock, a brick, a bottle, some piece of metal, anything you can pick up in the moment) are much more common. They are less scary, but everyone has access to them, and I've met few inexperienced people who come up with plans for defending against those before they start thinking about knives. It's not the really scary thing that kills you. It's the minimum viable thing.

When thinking about poisons, people tend to imagine the flashy, potent ones: cyanide, strychnine, tetrodotoxin. Anything sufficiently scary, with lethal doses in the low milligrams. The ones that are difficult to defend against and known first and foremost for their toxicity. On first pass this seems reasonable, but the fact that they are scary and hard to defend against means that it is very rare to encounter them. It is staggeringly more likely that you will suffer poisoning from acetaminophen or the like: OTC medications, cleaning products, batteries, pesticides, supplements. Poisons which are weak enough to be common. It's not the really scary thing that kills you. It's the minimum viable thing.

My impression is that people in AI safety circles follow a similar pattern, directing most of their attention at the very competent, very scary parts of risk-space rather than the large parts. Unless I am missing something, it feels pretty clear that the majority of doom-worlds are ones in which we die stupidly. Not by the deft hands of some superintelligent optimizer tiling the universe with its will, but by the clumsy ones of a process that is powerful enough to kill a significant chunk of humanity but not smart enough to do anything impressive after that point. Not a schemer, but an unstable idiot placed a little too close to a very spooky button by other unstable idiots.

Killing enough of humanity that the rest will die soon after isn't that hard. We are very, very fragile. Of course the sorts of scenarios which kill everyone immediately are less likely in worlds where there isn't competent, directed effort, but the post-apocalypse is a dangerous place, and the odds that the people equipped to rebuild civilisation will be among the survivors, find themselves around the means to do so, make a few more lucky rolls on location, and keep that spark going down a number of generations are low. Nowhere near zero, but low. In bits of branch-space in which it is technically possible to bounce back given some factors, lots of timelines get shredded. You don't need a lot of general intelligence to design a bio-weapon or cause the leak of one. With militaries increasingly happy to hand weapons to black-boxes, you don't need to be very clever to start a nuclear incident. The meme which makes humanity destroy itself might be relatively simple too. In most worlds, before you get competent maximizers with the kind of goal-content integrity, embedded agency and all the rest needed to kill humanity deliberately, keep the lights on afterwards and have a plan for what to do next, you get a truly baffling number of flailing idiots next to powerful buttons, or things with some but not all of the relevant capabilities in place – competent within the current paradigm but with a world-model that breaks down in the anomalous environments it creates. Consider the humble rock.

Another way of motivating this intuition is great-filter flavoured. Not only do we not see particularly many alien civs whizzing around, we also don't see particularly many of the star-eating Super-Ints that might have killed them. AI as a great filter makes more sense if most of the failure modes are stupid – if the demon kills itself along with those who summoned it.

This is merely an argument for a recalibration of beliefs, not necessarily an argument that you should change anything about your policies. In fact, there are some highly compelling arguments for why the assumption that we're likely to die stupidly shouldn't actually matter, in some relevant ways, for how you proceed.

One of them is that the calculus doesn't work that way: 1/100 odds of an unaligned maximizer are significantly worse than 1/10 odds of a stupid apocalypse, because the stupid apocalypse only kills humanity; the competent maximizer kills the universe. This is an entirely fair point, but I'd like you to make sure that this is actually the calculus you're running, rather than a mere rationalization of pre-existing beliefs.
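
To make that expected-value comparison explicit (a sketch only, using the illustrative 1/100 and 1/10 odds above; $V_{\text{universe}}$ and $V_{\text{humanity}}$ are stand-in value terms, not estimates):

$$\frac{1}{100}\,V_{\text{universe}} \;>\; \frac{1}{10}\,V_{\text{humanity}} \quad\Longleftrightarrow\quad V_{\text{universe}} > 10\,V_{\text{humanity}}.$$

On that accounting, the lower-probability risk dominates the expected loss whenever what the competent maximizer destroys is worth more than ten times what the stupid apocalypse destroys.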

The second is that the calculus is irrelevant, because most people in AI-safety positions have much more sway on levers that lead to competent maximizers than they do on levers which lead to idiots trusting idiots with doomsday-tech. There is a Garrabrantian notion that most of your caring should be tangled up with outcomes that are significantly causally downstream from you, so while one of those risks is greater, you have a comparative advantage at minimizing the smaller one, which outweighs the difference. This too might very well be true, and I'd merely ask you to check whether it's the real source of your beliefs, or whether you are unduly worried about the scarier thing because it is scary. Because of a narrativistic way of thinking, where the story doesn't end in bathos. Where the threat is powerful. Where you don't just get hit over the head with a rock.

It might in this specific case be dignified to put all your effort into preparing for knife fights, but I think your calibration is off if you think that those aren't a small subset of worlds in which we die. It's not the really scary thing that kills you. It's the minimum viable thing.


