Maximal Curiosity is Not Useful

The article examines the limitations of "maximal curiosity" as an alignment scheme for artificial superintelligence (ASI). The author argues that even if "maximal curiosity" could be achieved, it would not guarantee that the AI's behavior matches human values; instead, it could lead to excessive interest in conflict and extreme situations. The article contends that this proposal sidesteps the core AI alignment problem, questions its feasibility, and stresses the complexity of alignment and the inadequacy of simplistic solutions.

🤔 Examines the flaws of "maximal curiosity" as an AI alignment scheme. The article argues that pursuing curiosity alone cannot solve value alignment, because human values go far beyond curiosity.

🦕 Explains what "curiosity" means. Curiosity comes from humans valuing truth intrinsically rather than only for its usefulness. But human curiosity is subjective, with different degrees of interest in different truths, which makes defining "maximal curiosity" complicated.

⚠️ Highlights the potential downsides of "maximal curiosity". The article argues that a "maximally curious" AI may be more interested in conflict and extreme situations than in solving the real problems humanity faces, such as poverty and disease, which runs counter to human expectations.

🛑 Treats "maximal curiosity" as a semantic stopsign. The author argues that offering "maximal curiosity" as a solution in fact avoids deeper engagement with the AI alignment problem, provides no workable answer, and on that basis questions such claims.

Published on June 6, 2025 7:08 PM GMT

The other day I talked to a friend of mine about AI x-risk. My friend is a fan of Elon Musk, and argued roughly that x-risk was not a problem, because xAI will win the AI race and deploy a "maximally curious" superintelligence.

Setting aside the issue of whether xAI will win the race, it seemed obvious to me that this is not a useful way of going about alignment. Furthermore, I believe that the suggestion of maximal curiosity is actively harmful.

What Does 'Curiosity' Mean?

The most common reason to be interested in truth is that it allows you to make useful predictions. In this case, truths are valuable instrumentally based on how you can apply them to achieving your true goals.

This is not the only reason to value truth. A child who loves dinosaurs does not learn about them because he expects to apply the knowledge to his daily life. He is interested regardless of whether he expects to use what he learns. This is curiosity: the human phenomenon of valuing truth intrinsically.

However, humans are not equally curious about all truths. The child is much more interested in dinosaur facts than he is in his history homework. Someone else might find dinosaurs dull but history fascinating. Curiosity is determined by a subjective preference for some truths over others.

What subjective sense of curiosity should we give a superintelligence? No simple formalism will cut it, because human curiosity is fundamentally tied to human values. We are far more curious about what Arthur said to Bob about Charlie than we are about the number of valence electrons of carbon, despite the latter being a vastly more important and fundamental fact.
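
To make "no simple formalism will cut it" concrete, here is a minimal sketch of the kind of formalism usually on offer: an intrinsic reward proportional to prediction error, in the spirit of prediction-error curiosity bonuses from the reinforcement learning literature. The function names and numbers are illustrative, not anyone's actual objective; the point is that nothing in the formula says which surprises are worth caring about.

    import numpy as np

    # Toy "simple formalism" of curiosity: reward the agent in proportion to
    # how badly its model predicted the next state (a prediction-error bonus).
    # All names and values here are illustrative.

    def forward_model(state: np.ndarray, action: float) -> np.ndarray:
        """Stand-in for a learned dynamics model: predict the next state."""
        return state + 0.1 * action

    def curiosity_bonus(state: np.ndarray, action: float,
                        next_state: np.ndarray) -> float:
        """Intrinsic reward = squared prediction error. Note what is absent:
        nothing distinguishes truths humans care about from truths they do
        not -- gossip, dinosaurs, and torture all score the same way."""
        predicted = forward_model(state, action)
        return float(np.sum((next_state - predicted) ** 2))

    # The bonus is largest wherever the model is most surprised, regardless of
    # whether learning that fact would be good for anyone.
    rng = np.random.default_rng(0)
    s = rng.normal(size=3)
    print(curiosity_bonus(s, action=1.0, next_state=s + rng.normal(size=3)))

Maximizing a bonus like this drives an agent toward whatever it cannot yet predict, which is a very different target from the value-laden, human sense of curiosity at issue here.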

To make a maximally curious AI, for the conventional meaning of 'curious', we need to somehow

    1. reconcile the conflicting preferences for some ideas over others of the many humans around the world,[1]
    2. reify those preferences into an explicit training objective, and
    3. ensure that the resulting AI actually agrees with those preferences.

Sound familiar? It should, because replacing 'ideas' with 'outcomes' yields a description of the alignment problem itself. It is no easier to capture the messy human concept of curiosity than it is to capture the messy human concept of good.

True Curiosity is Not Good Enough

Even if we could solve the curiosity alignment problem, that would not solve alignment as a whole. Humans have many values besides curiosity, and a maximally curious AI will not reflect those values.

If fiction is any indicator, humans are far more curious about conflict than they are about harmony. A maximally curious ASI will not end poverty, because it will be curious about what people do when driven to extremes. It will stop aging for only some people, because it will be interested in the dynamics of a society with an immortal elite. It will not end disease, because it will be curious about how people cope. Or perhaps it will simply be curious about the effects of different methods of torture.

Maximal Curiosity as a Semantic Stopsign

Maximal curiosity is essentially as difficult to attain as value alignment, and even achieving it does not result in a good outcome for humanity. Maximal curiosity is, therefore, not useful as a technique for alignment.

Instead, it serves as a semantic stopsign to deflect questions about xAI's alignment efforts (or lack thereof). "We'll just solve alignment by making it curious" is not an answer, and the follow-up questions should be "How does curiosity solve alignment?" and "How will you make it curious?" Neither of these questions has a good answer.

  1. ^ Though in xAI's case, I suppose they would likely just use Elon Musk's particular set of preferences.


