LessWrong | January 30
My Mental Model of AI Optimist Opinions

The article lays out an exaggerated composite of AI optimist opinions, all of which the author personally rejects: that AI development led by OpenAI is going fine rather than posing existential risk, that actual AI progress comes from optimizing the pretraining process and integrating AI with human culture, that we can control AI through examples and feedback, and that worries about AI risk are themselves harmful because they push society toward violent regulation.

💡 The rapid progress led by OpenAI is producing AI that will soon surpass humanity, and its influence on the world so far is basically fine

🎯 Actual AI progress lies in optimizing the pretraining process and integrating AI with human culture

🧐 We can control AI through examples and ranked feedback

🚫 Worries about AI risk are themselves dangerous, e.g. by pushing society toward violent regulation of AI

Published on January 29, 2025 6:44 PM GMT

Epistemic status: The text below is a sort of strawman of AI optimists, where I took my mental model for how I disagree with rationalist AI optimists and cranked it up to 11. I personally disagree with every sentence below, and I'm posting it here because I'm interested in whether AI optimists have any major corrections they want to make in the comments. Of course I understand that everyone has their own unique opinion and so I would expect every AI optimist to at least disagree with some parts of it too.

The rapid progress spearheaded by OpenAI is clearly leading to artificial intelligence that will soon surpass humanity in every way. People used to be worried about existential risk from misalignment, yet we have a good idea about what influence current AIs are having on the world, and it is basically going fine.

The root problem is that The Sequences expected AGI to develop agency largely without human help; meanwhile, actual AI progress occurs by optimizing the scaling efficiency of a pretraining process that is mostly focused on integrating the AI with human culture. This means we will be able to control AI by just asking it to do good things, showing it some examples, and giving it some ranked feedback.
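(For concreteness, the "ranked feedback" here is the kind of pairwise preference training used in RLHF-style pipelines. Below is a minimal, hypothetical sketch of a Bradley-Terry-style reward-model loss; the function name and tensor values are illustrative, not from the post.)

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    """Pairwise ranking loss: push the reward assigned to the human-preferred
    response above the reward assigned to the rejected one."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy usage with made-up reward-model scores for three comparisons.
chosen = torch.tensor([1.3, 0.2, 0.9])
rejected = torch.tensor([0.4, 0.5, -0.1])
loss = preference_loss(chosen, rejected)
print(loss)  # scalar to backpropagate through the reward model
```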

You might think this is changing with inference-time scaling, yet if alignment were going to fall apart as new methods came into use, we'd have seen signs of it with o1. In the unlikely case that our current safety measures turn out to be insufficient, interpretability research has worked out lots of deeply promising ways to improve, with sparse autoencoders letting us read the minds of the neural networks and thereby screen them for malice, and activation steering letting us control the networks to our hearts' content.
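(As a gloss on "activation steering": the usual recipe is to add a fixed direction to a model's residual-stream activations at some layer during generation. Here is a minimal, hypothetical sketch against a GPT-2 checkpoint; the layer index is arbitrary and the steering vector is a random stand-in for one derived from contrasting prompts.)

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

layer_idx = 6                                   # arbitrary block to steer (assumption)
steer = 0.5 * torch.randn(model.config.n_embd)  # stand-in vector; in practice derived from
                                                # contrasting activations, not random noise

def add_steering(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states.
    hidden = output[0]
    return (hidden + steer.to(hidden.dtype),) + output[1:]

handle = model.transformer.h[layer_idx].register_forward_hook(add_steering)
ids = tok("The best way to deploy AI is", return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=20)
handle.remove()
print(tok.decode(out[0]))
```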

AI x-risk worries aren't just a waste of time, though; they are dangerous because they make people think society needs to make use of violence to regulate what kinds of AIs people can make and how they can use them. This danger was visible from the very beginning, as alignment theorists thought one could (and should) make a singleton that would achieve absolute power (by violently threatening humanity, no doubt), rather than always letting AIs be pure servants of humanity.

To "justify" such violence, theorists make up all sorts of elaborate, unfalsifiable, and unjustifiable stories about how AIs are going to deceive and eventually kill humanity, yet the initial deceptions by base models were toothless, and thanks to modern alignment methods, serious hostility or deception has been thoroughly stamped out.



