Avoiding the Bog of Moral Hazard for AI

 

This article explores the moral questions raised by AI development. It points out that, when facing entities with non-human minds, we are deeply morally confused: we are unsure when they acquire moral patienthood, and we lack moral judgments about how to interact with them. It stresses that AI should be developed cautiously in order to steer clear of moral hazard.

🎯 AI development carries moral hazard: we understand too little about these non-human minds, and we don't know when they will have human-like sentience or consciousness, which makes it hard to determine their moral patienthood.

💡 As AI development proceeds, we lack moral judgments about how to interact with these entities. They operate by different rules than biological beings, and our moral intuitions about operations such as copying, merging, pausing, and resuming are unclear.

🚧 To avoid this moral hazard, we should develop AI carefully: either create fully morally relevant entities with human-level moral patienthood and treat them as equals, or deliberately refrain from making intelligent beings that can suffer. For now, we should focus on building tool-AIs without the capacity for suffering.

Published on September 13, 2024 9:24 PM GMT

Imagine, if you will, a map of a landscape. On this map, I will draw some vague regions. Their boundaries are uncertain, for it is a new and under-explored land. This map is drawn as a graph, but I want to emphasize that the regions are vague guesses, and the true borders could be very convoluted.

 

 

So here's the problem. We're making these digital minds, these entities which are clearly not human and which process the world in different ways from human minds. As we improve them, we wander further and further into this murky, fog-covered bog of moral hazard. We don't know when these entities will become sapient, conscious, valenced, etc., to such a degree that they have moral patienthood. We don't have a good idea of which patterns of interaction with these entities would be moral versus immoral. They operate by different rules than biological beings: copying, merging, pausing and resuming, inference from checkpoints with frozen weights... We don't have good moral intuitions for these things because they differ so much from biological minds.
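To make concrete how routine these operations already are for software, here is a minimal, purely illustrative Python sketch. The `DigitalMind` class and its fields are hypothetical stand-ins, not any real system's API; the point is only that copying, pausing/resuming, and merging are one-liners on program state, which is exactly where biological intuitions give out.

```python
import copy
import pickle
from dataclasses import dataclass, field


@dataclass
class DigitalMind:
    """Hypothetical stand-in for a digital mind: frozen weights plus accumulated memory."""
    weights: dict
    memories: list = field(default_factory=list)

    def think(self, prompt: str) -> None:
        # Inference from frozen weights: reads `weights`, appends to `memories`.
        self.memories.append(f"thought about: {prompt}")


# "Copying": one line yields two beings that diverge from here on.
original = DigitalMind(weights={"layer_1": [0.1, 0.2]})
original.think("the bog")
clone = copy.deepcopy(original)

# "Pausing and resuming": just serialization to bytes (or disk) and back.
checkpoint = pickle.dumps(original)   # paused
resumed = pickle.loads(checkpoint)    # resumed later, unchanged

# "Merging": no biological analogue; here, naively pooling two memory streams.
merged = DigitalMind(weights=original.weights,
                     memories=original.memories + clone.memories)
```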

 

Once we all agree that we are working with an entity on the right-hand side of the chart, and we act accordingly as a society, then we are clear of the fog. Many mysteries remain, but we know we aren't undervaluing the beings we are interacting with.

While we remain very clearly on the left-hand side of the chart, we are also fine. These are entities without the capacity for human-like suffering, who don't have significant moral valence according to most human ethical philosophies.

 

Are you confident you know where to place Claude 3 Opus or Claude 3.5 Sonnet on this chart? If you are confident, I encourage you to take a moment to think carefully about this. I don't think we have enough understanding of the internals of these models to be confident.

My uncertain guess would place them in the Bog of Moral Hazard, but close to the left-hand side. In other words, they are probably not yet moral patients, but close to the region where they might become such. I don't think we will be able to clear up the murk surrounding the Bog of Moral Hazard anytime soon, so we need to be very careful, as we proceed with developing AI, to deliberately steer clear of the Bog. Either we make a fully morally relevant entity, with human-level moral patienthood, and treat it as equal to humans, or we deliberately don't make intelligent beings who can suffer.

Since creating a human-level mind would carry enormous risks, both of disruption to society and of catastrophic harms, I would argue that humanity isn't ready to make a try for the right-hand side of the chart yet. For now, we should stick to deliberately making tool-AIs that don't have the capacity to suffer.

Even if you fully intended to treat your digital entity with human-level moral importance, it still wouldn't be okay to do so. We first need philosophy, laws, and enforcement that can determine things like:

"Should a human-like digital being be allowed to make copies of itself? Or to make merge-children with other digital beings? How about inactive backups with triggers to wake them up upon loss of the main copy? How sure must we be that the triggers won't fire by accident?"

 

"Should a human-like digital being be allowed to modify it's parameters and architecture, to attempt to self-improve? Must it be completely frozen, or is online-learning acceptable? What should we do about the question of checkpoints needed for rollbacks, since those are essentially clones?"

 

"Should we restrict the entity to staying within computer systems where these laws can be enforced? If not, what do we do about an entity which moves onto a computer system over which we don't have enforcement power, such as in a custom satellite or stealthy submarine?"

 

I am writing this post because I am curious about others' thoughts on this. I want to hear from people who have different intuitions around this issue.

 

This is discussed on the Cognitive Revolution Podcast by Nathan Labenz in these recent episodes:

https://www.cognitiverevolution.ai/ai-consciousness-exploring-the-possibility-with-prof-eric-schwitzgebel/

https://www.cognitiverevolution.ai/empathy-for-ais-reframing-alignment-with-robopsychologist-yeshua-god/


