LessWrong, July 21, 13:57
Just Make a New Rule!

 

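This article discusses the importance of social rules, emphasizing the role that clear rules play in coordinating relationships between people and in protecting individual freedom and dignity. The author argues that, despite theoretical worries about "behavior that isn't forbidden," explicit rules are in most cases enough to guide people's behavior, with no need to give leaders excessive power to intervene in the details of individual lives. Using an analogy with AI and examples from everyday life, the article explains that the practical function of rules is not to match Society's values perfectly but to find a balance between one person's freedom and another's, and that making and refining rules is an ongoing iterative process rather than an eternal struggle against a "hostile intelligence." Finally, the author urges readers to return to an understanding of ordinary people and to solve social problems by continually improving the rules rather than resorting to extreme control.

✅ Rules are a key technology for the functioning of society. They clearly define what is allowed and what is forbidden, so that people can live in freedom and dignity without contorting themselves to satisfy the arbitrary demands of a distant Authority. Clear rules give individuals explicit guidance and reduce uncertainty, which protects both social order and personal freedom.

💡 On the worry that a clever adversary might get around the existing rules, the author argues that this worry overstates the complexity of the problem, and points out that the purpose of social rules differs from that of a utility function in AI. Social rules are not meant to embody Society's values perfectly. They are meant to ensure that people can live in freedom and dignity, and their core job is to balance conflicts between one person's freedom and another's, not to play a game against a "hostile intelligence."

🚗 Rules such as traffic laws exist to ensure basic conditions like safety and to prevent the danger that arbitrary individual behavior would create, which in turn protects people's ability to live in freedom and dignity. When rules exist and are obeyed, people can focus on creating value instead of wasting effort on the fallout from conflict.

🚫 Examples such as the lead paint ban show that once a clear rule is made, the affected parties (such as paint manufacturers) simply comply and stop making products that contain the harmful substance, because they are not "environmental lead maximizers." This demonstrates that rules are effective at guiding behavior and achieving social goals, without giving leaders excessive power over the details of individual behavior.

🤔 The author argues that people who consider rules unworkable and would rather empower leaders to intervene in the details of individual behavior may be too immersed in science fiction and may be overlooking the basic relationships between people in real society. Other people in society are not "hostile intelligences." They are ordinary people who happen to have different preferences. If there are problems, making a new rule can solve them, with no need for an extreme concentration of power.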

Published on July 21, 2025 5:54 AM GMT

"Rules" are a critical social technology for helping people live and work together in peace. From the laws passed by legislatures to govern a whole nation, to the bylaws of a neighborhood homeowner association, to the informal household rules of a single family, explicit rules make it clear to everyone what behavior is required and what behavior is forbidden, without otherwise controlling every minute detail of everyone's behavior.

When there are clear rules, people don't have to drive themselves crazy contorting themselves into unnatural shapes to satisfy the whims of some distant Authority. All you have to do is make sure to obey the rules. With that taken care of, you can go about living your life the way you see fit, in freedom and dignity. As can be attested in the annals of human experience from the time of Hammurabi into the present day, it mostly works pretty great—at least compared to the alternatives. In summary, rules are good. It's good to have clear rules, and for people to obey the rules.

Normal people understand this pretty well and probably don't need to read a blog post about it, but some people who aren't normal have a theoretical objection. The space of all possible behaviors is unthinkably vast. What if the formidable intelligence of an adversary who hates everything our Society stands for comes up with a behavior that's really bad but isn't forbidden by any of Society's rules?

The normal person is unfazed by the theoretical objection. If that happens, you could just make a new rule forbidding that behavior, right? How hard could that be?

The people who aren't normal are unimpressed with this reply. They can tell that the normal person doesn't understand the vastness of the space of possible behaviors at all. If you just make a new rule, surely the formidable intelligence of the adversary will contrive some other eldritch behavior that minimizes Society's utility function while complying to the letter of all of Society's rules. The theory of nearest unblocked strategies in the lore of AGI alignment, and the specter of specification gaming in the practice of ML engineering, make it clear that this is so. Thus, rules won't suffice; we need to empower leaders with the Authority to make judgement calls—even to control the minute details of anyone's behavior, if that's what it takes to safeguard Society's Values.

Now me, I'm normal on my mother's side, which puts me in a good position to understand what both parties to the disagreement are saying. And while my full belief-state about related topics in the theory of decision and optimization is nuanced and complex, on the narrow question of what to do about rules in human Society, I think the normal people have it basically right, and the people who aren't normal are being scared of ghosts. Let me explain.

I do not dispute the lore of AGI alignment, nor the practice of ML engineering. But crucially, the purpose of rules in human Society is highly disanalogous to the purpose of a utility or reward function in AI. Rules aren't supposed to express Society's true Values, let alone be a perfect specification robust to nearest unblocked strategies. The Values live in the hearts of Society's individual women and men, to be expressed in the way they go about living their lives the way they see fit, in freedom and dignity. The rules are just there to stop ourselves from trying to kill each other when your freedom and dignity is getting in the way of my freedom and dignity, so that we can focus on creating Value instead of wasting effort trying to kill each other.

Rules are written to ensure conditions conducive to people living their lives in freedom and dignity when those conditions wouldn't obtain in the absence of a rule. Traffic laws make it clear to everyone when it's safe to enter the road. If everyone just entered the road whenever they felt like it, that would be dangerous, and the danger would interfere with people living their lives in freedom and dignity.

The theory of nearest unblocked strategies can be relevant to rules in human Society to the extent that the conditions that a rule is intended to ensure are something that some people oppose either terminally or due to strong instrumental convergence. Income tax laws are passed so that the government will have money to fund police to enforce all the other laws, but that money has to come from somewhere and people really don't like having less money, so they put the full force of their effort and ingenuity into side-stepping the law with clever nearest unblocked strategies: underreporting cash transactions, hiding money in offshore accounts, recategorizing consumption as business expenses, &c.

But more often, the conditions that a rule is intended to ensure aren't something that people terminally or convergently-instrumentally oppose. The rule merely restricts behavior that people would otherwise engage in instrumentally, but not convergently instrumentally: if the rule is in place, they can and will avoid the behavior in order to comply with the rule.

Lead paint is an environmental hazard, so it was banned in 1978. Because of the ban, paint manufacturers stopped making lead paint. The paint manufacturers did not put the full force of their effort and ingenuity into clever nearest unblocked strategies for increasing the amount of lead in the environment, because they're not environmental lead maximizers, which aren't a real thing. The paint manufacturers just wanted to make paint. When there wasn't a rule against it, they used lead carbonate in their paint because it was convenient, but when there was a rule against it, they stopped. The rule worked—without the need for empowering an Authority to make judgement calls controlling the minute details of everyone's behavior. Why wouldn't it?

In some situations, there might be weak instrumental convergence pressures such that the first attempt at making a rule doesn't quite succeed at ensuring the conditions that the rule was meant to ensure. It turns out that, on further consideration, Society doesn't just want to avoid environmental contamination with lead in particular, but all other toxic heavy metals, too, some of which also happen to be convenient for making paint. So paint manufacturers still ended up using mercury in some paints until 1991 when that was banned, too. But once it was banned, they stopped. Why wouldn't they? They're not environmental mercury maximizers, either, which also aren't a real thing.

The work of coming up with rules to ensure socially beneficial outcomes can be frustrating, because you won't always get the rules exactly right the first time. You might need to iterate. But it's a finite and achievable amount of work, not an unwinnable unending battle against the formidable intelligence of an adversary who hates everything your Society stands for, because those mostly aren't a real thing either.
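To make the contrast concrete, here is a toy sketch in Python. Everything in it is a hypothetical illustration rather than data: the option names beyond lead and mercury, the costs, and the helper names (`Option`, `indifferent_choice`, `adversarial_choice`, `rules_needed`) are all invented for the example. It assumes a rule-maker who bans whatever harmful option was just chosen, and compares a manufacturer who only wants the cheapest allowed ingredient with an imagined adversary who wants harm for its own sake.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    cost: int      # hypothetical cost to the manufacturer
    harmful: bool  # whether Society wants to prevent its use


# Hypothetical menu of paint ingredients; the costs are made up.
OPTIONS = [
    Option("lead carbonate", 1, True),
    Option("mercury compound", 2, True),
    Option("harmless pigment", 3, False),
    Option("exotic toxin A", 5, True),
    Option("exotic toxin B", 6, True),
]


def indifferent_choice(allowed):
    """A manufacturer who just wants to make paint: cheapest allowed option."""
    return min(allowed, key=lambda o: o.cost)


def adversarial_choice(allowed):
    """An imagined harm maximizer: any harmful option beats any harmless one."""
    return min(allowed, key=lambda o: (not o.harmful, o.cost))


def rules_needed(choose, options=OPTIONS):
    """Ban each harmful pick in turn; return the bans issued, in order.

    Terminates because the harmless option is never banned.
    """
    banned = []
    while True:
        allowed = [o for o in options if o.name not in banned]
        pick = choose(allowed)
        if not pick.harmful:
            return banned
        banned.append(pick.name)


if __name__ == "__main__":
    print("indifferent:", rules_needed(indifferent_choice))
    print("adversarial:", rules_needed(adversarial_choice))
```

Under these made-up numbers, the indifferent manufacturer stops after two bans (lead, then mercury), because the harmless pigment becomes the cheapest thing left. The adversary forces a ban on every harmful option on the menu. In a vast space of options that second list never ends, which is the situation the theoretical objection imagines, but the first list only has to cover the harmful options that happen to be more convenient than the harmless one.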

In conclusion, I think that people who think rules are unworkable and instead want to empower an Authority to make judgement calls controlling the minute details of everyone's behavior need to read less science fiction and spend more time relating to other people in their Society as people. Notwithstanding that terrifying alien superintelligences couldn't be constrained by rules because a merely human intellect lacks the capabilities to enumerate all the nearest unblocked strategies, other people in your Society are not terrifying alien superintelligences. We're just people who don't have exactly the same preferences as you. We won't always agree, but it shouldn't be this hard to live in peace with each other. If there are problems, you can just make a new rule!

(Thanks to Robert Mushkatblat and Ben Pace.)


