LessWrong, October 7, 2024
Compelling Villains and Coherent Values

 

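This article explores the relationship between moral coherence and trade-offs, using Batman and the Joker from "The Dark Knight" to contrast two moral outlooks: an unwavering moral code and a chaotic, disordered one. The author argues that there is a delicate balance between moral coherence and trade-offs: one must pursue personal convictions while also accounting for the complexity and uncertainty of the real world. The article also discusses how rationalists relate to moral coherence and how rational thinking can be combined with moral choice.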

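🤔 **Moral coherence and trade-offs:** Using Batman and the Joker from "The Dark Knight" as examples, the article explores the relationship between moral coherence and trade-offs. Batman represents an unwavering moral code: he holds to his principles and does not compromise even in extreme situations. The Joker represents a chaotic moral outlook: he lacks clear values and goals and takes pleasure in creating chaos. The author argues that there is a delicate balance between moral coherence and trade-offs, and that pursuing personal convictions must be weighed against the complexity and uncertainty of the real world.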

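🤖 **Rationalists and moral coherence:** The author suggests that rationalists may be limited when it comes to moral coherence. Because rationalists lean toward logical reasoning and analysis, they may struggle with the hard-to-quantify factors in moral dilemmas, such as emotion and compassion. In addition, rationalists sometimes pursue logical consistency too far and overlook the complexity and uncertainty of the real world.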

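💡 **Combining rational thinking with moral choice:** The article closes by discussing how rational thinking and moral choice can be combined. The author argues that rational thinking helps us understand moral problems more clearly but cannot fully replace moral judgment. When facing a moral dilemma, we should weigh both rational analysis and emotional experience and make decisions consistent with our own values.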

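💪 **Personal power and moral coherence:** The author argues that personal power and moral coherence are closely linked. Strong moral convictions and values can give a person great strength and help them overcome difficulties and challenges. People who lack moral coherence, by contrast, tend to lose their way and find it hard to make wise choices.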


Published on October 6, 2024 7:53 PM GMT

Epistemic status: This essay attempts to communicate intuitions strongly shaped by @Joe Carlsmith's writing, particularly "On Green."

Warning: very minor spoilers, mostly for "The Dark Knight".

Some people have a sense of weight about them. They know exactly what they want and can explain why it is Right. Presented with a moral dilemma, they resolve conflicts between local values in a consistent way by appealing to higher values - they can tell you today whether they would actually pull the lever in the trolley problem and why they (hopefully) wouldn't harvest the organs from one patient to save five - or why they would be compelled to. Most of these people seem to be deeply religious - the real-world example I have in mind is an intelligent Catholic friend who once had his bike stolen and explained to me that it was a shame because he would have happily given it away if someone needed it. I know him well enough to say it was true. Ask him what he wants out of life and he gives a well-reasoned, specific answer justified in terms of a hierarchy of values from family and charity to (his idea of) "serving God." I think of this as moral[1] coherence. 

The paragon of this class is some sort of warrior monk defending the gates of an abbey against a horde of barbarians, and afterwards burying each of them with an hours-long prayer. The opposite is perhaps the Joker, who we will come back to.

Mostly, I don't get this sense of weight from nerds or rationalists - I think it's not a natural part of nerd culture. Nerds are too clever and contrarian; we mostly can't even agree with ourselves. Also, we tend to "throw off the shackles" of religion and often conventional morality, which are both systems carefully engineered for moral coherence, though modern morality is much more cosmopolitan. However, I do get a certain sense of moral coherence off of Eliezer (Scott seems more complicated).

Some of the most compelling villains are morally coherent. They have some consistent, authentic beliefs that set them on a collision course with the protagonist and in the best cases bring out the moral conflicts within the protagonist, often by posing some kind of ultimatum (for instance Jason Bourne, who has a complicated relationship with authority and nationalism, usually fights hardcore patriots). We tend to like a little moral confusion in our protagonists; perhaps it makes them relatable. Think of the cold determination of Thanos facing down the internally divided Avengers, or Colonel Miles Quaritch mercilessly slaughtering aliens in Avatar (the movie was a bit basic but I think he was an excellent, formidable antagonist). In the worst cases, such as Voldemort[2] or Sauron, the villain is just pure evil, but in the better cases they are pure ideological extremists with their own compelling worldview. Sometimes the protagonist has two or more conflicting priorities, almost like an internal society trying to negotiate their actions, and the villain embodies one of the components and forces the protagonist to acknowledge and reconcile it.  

But I think the greatest hero / villain dynamics reverse this pattern. The hero embodies a strict moral code and the villain is... insane. The clearest well-known example is Batman versus the Joker. I am thinking particularly of their portrayals in "The Dark Knight." Batman is a Kantian who refuses to kill and whose entire being is devoted to a quest for justice/vengeance. He's superhumanly disciplined with a spartan training regimen (and therefore jacked, which seems to be a common sign of moral coherence in both fiction and reality). He doesn't seem to spend any time struggling with moral dilemmas[3], though his actions often seem morally ambiguous to others. The Joker, on the other hand, is dangerous because of his intelligence, charisma, and raw unpredictability, but doesn't seem to have any fixed objective at all except possibly to cause chaos. He is in a sense so morally incoherent that he no longer faces internal conflict; he is just a force of nature. Their struggle is fascinating to watch not because there seems to be any chance of either party questioning themselves, but because there doesn't. It's like an unstoppable force meeting an immovable object. The suspense is mostly over who's going to end up inside the blast radius[4].

Think of the Dark Tower's gunslinger Roland Deschain versus the man in black Walter O'Dim. Roland is consistently characterized as simple-minded, not particularly intelligent, driven by one overriding goal. Walter is clever and full of secrets, but he's his own worst enemy everywhere he appears in Stephen King's universes. My reading is that behind the curtain even he doesn't really know what his schemes are for.

Similarly, Lucifer is often portrayed as a paragon of moral incoherence for its own sake, almost constructing a kind of meta-coherence but still constantly undercutting itself with overly clever schemes. 

I think it's worth noting here that coherent morals are quite separable from coherent beliefs. Decision theorists have worked out a lot of the math of coherent belief: there is a thorough argument that any beliefs satisfying the Cox axioms for consistency can be represented and updated according to probability theory. This doesn't tell us how to correct our incoherent beliefs, but in practice I think that Bayesian updating tends to automatically smooth out inconsistencies[5]. On the other hand, we can represent coherent preferences as utility functions according to the von Neumann and Morgenstern criteria for rationality. But humans don't have coherent preferences, and we don't seem to have a mathematical description of how to correct this. This may be required for a solution to the alignment problem. For instance, corrigibility is essentially (meant to be) a choice of algorithm for resolving incoherent preferences. However, the situation seems to be less convenient for "cohering" preferences[6].
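To make the contrast concrete, here is a minimal Python sketch under stated assumptions, not a statement of the underlying theorems. The first half uses a toy coin-flip model (a Beta prior updated on Bernoulli data, chosen purely for illustration) to show two agents with opposite priors being pulled together by the same evidence. The second half checks whether a set of strict pairwise preferences admits any utility ranking at all; this ordinal check is much weaker than the von Neumann and Morgenstern axioms, which concern preferences over lotteries. All function names, priors, and option labels are hypothetical.

```python
from fractions import Fraction
from graphlib import CycleError, TopologicalSorter  # standard library, Python 3.9+


def posterior_mean(prior_heads, prior_tails, heads, tails):
    """Mean of a Beta(prior_heads, prior_tails) belief after observing coin flips."""
    return Fraction(prior_heads + heads,
                    prior_heads + prior_tails + heads + tails)


def utility_from_strict_preferences(pairs):
    """Return a utility dict consistent with (better, worse) pairs, or None if cyclic."""
    graph = {}
    for better, worse in pairs:
        # TopologicalSorter treats the mapped values as predecessors, so
        # "worse" is emitted before "better" and receives a lower rank.
        graph.setdefault(better, set()).add(worse)
        graph.setdefault(worse, set())
    try:
        order = list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None  # e.g. A > B > C > A: no utility function can rank these
    return {option: rank for rank, option in enumerate(order)}


# Beliefs: an optimist and a pessimist (hypothetical priors) see the same flips.
optimist_prior, pessimist_prior = (8, 2), (2, 8)
heads, tails = 600, 400
gap_before = abs(posterior_mean(*optimist_prior, 0, 0)
                 - posterior_mean(*pessimist_prior, 0, 0))
gap_after = abs(posterior_mean(*optimist_prior, heads, tails)
                - posterior_mean(*pessimist_prior, heads, tails))

# Preferences: a transitive hierarchy versus a cycle (hypothetical labels).
coherent_utility = utility_from_strict_preferences(
    [("family", "career"), ("career", "leisure")])
cyclic_utility = utility_from_strict_preferences(
    [("A", "B"), ("B", "C"), ("C", "A")])

print(float(gap_before), float(gap_after))
print(coherent_utility, cyclic_utility)
```

In this toy setting the disagreement between the two agents shrinks without either of them doing anything beyond updating, whereas the cyclic preferences simply fail the check: the code can detect the incoherence but has no principled rule for which preference to drop, which is the asymmetry the paragraph above points at.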

I wonder if rationalists, having basically worked out coherence of belief but not of action, are particularly prone to power-seeking. When you're not sure what you want but you feel exceptionally competent it makes sense to increase your optionality so you can get whatever you happen to want tomorrow. This may seem counter-intuitive since a person with very coherent values should pursue power instrumentally. But when you have a specific goal there are often more direct ways to obtain it than first power-seeking, such as various forms of self-sacrifice. For instance, you might serve as a soldier to protect your country instead of forming your own mercenary band (= law firm, in the modern day), donate your kidney instead of training for a marathon, or volunteer at a food bank instead of grinding out leetcode problems after work. A reliable gear in a larger machine might be less agentic but more useful than a scheming Machiavellian. Goals like solving the alignment problem are unusually strong justifications for cultivating personal excellence, but there are a lot of direct ways to save lives that just aren't as glorious. I am not convinced that EA-style earning to give always or even usually works out to higher expected value for most people, since I tend to think that doing good now compounds faster than career capital.

Also, it is interesting to consider whether there are deep reasons that coherent belief seems to be anti-correlated with coherent morals. For instance, internal divisions are the opposite of moral coherence, but should increase self-doubt, which is useful for reasoning under uncertainty. A Bayesian cultivates lightness, but a warrior monk has weight. Can these two opposing and perhaps contradictory natures be united to create some kind of unstoppable Kwisatz Haderach? Perhaps romantically, this is how I like to imagine Miyamoto Musashi, but I don't know of anyone fit to inherit this mantle today. 

  1. ^

    Substitute "utility" for "moral" if you like - this intellectualization of vocabulary also took place historically in decision theory.

  2. ^

    Various supporting villains are more complex than the Dark Lord. Lucius Malfoy is a true believer who feels a little more realistic. 

  3. ^

    For that, you would need to see either the earlier or the later movie in the trilogy - I watched TDK first (out of order) which probably colored my impression of this Batman significantly.

  4. ^

    The Doctor and the Master have a similar relationship. Though, on second thought, perhaps there is just a hint of dread that even our most morally coherent heroes will stare into the void of a mad nemesis's broken mind and be changed - and that if even they can be corrupted, maybe there is no fundamental difference between good and evil.  

  5. ^

    See Blackwell and Dubin's results on merging of opinions. Technically this only explains how people with different priors that still obey probability theory can come to agreement. But at least one doesn't need to get the max entropy calculations exactly right - some sorts of errors can be smoothed out by Bayesian updating.

  6. ^

    This is my intuition because the distinction seems similar to the choice of UTM for the universal distribution in belief and action. For pure prediction it only matters up to a constant, but in history-based reinforcement learning choosing the wrong UTM ruins convergence guarantees. 


