少点错误 9分钟前
A Timing Problem for Instrumental Convergence
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

本文与Helena Ward和Jen Semler合著,探讨超级智能AI的潜在风险,指出工具理性并不要求目标保持稳定,并分析了由此产生的时序问题对AI安全的启示。

Published on July 30, 2025 7:15 PM GMT

This paper of mine ("A Timing Problem for Instrumental Convergence"), co-authored with Helena Ward and Jen Semler, was recently accepted in Philosophical Studies for a superintelligent robots issue (open access). The paper argues that instrumental rationality doesn't require goal preservation/goal-content integrity/goal stability. Here is the abstract:

Those who worry about a superintelligent AI destroying humanity often appeal to the instrumental convergence thesis—the claim that even if we don’t know what a superintelligence’s ultimate goals will be, we can expect it to pursue various instrumental goals which are useful for achieving most ends. In this paper, we argue that one of these proposed goals is mistaken. We argue that instrumental goal preservation—the claim that a rational agent will tend to preserve its goals because that makes it better at achieving its goals—is false on the basis of the timing problem: an agent which abandons or otherwise changes its goal does not thereby fail to take a required means for achieving a goal it has. Our argument draws on the distinction between means-rationality (adopting suitable means to achieve an end) and ends-rationality (choosing one’s ends based on reasons). Because proponents of the instrumental convergence thesis are concerned with means-rationality, we argue, they cannot avoid the timing problem. After defending our argument against several objections, we conclude by considering the implications our argument has for the rest of the instrumental convergence thesis and for AI safety more generally.



Discuss

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

超级智能AI 工具理性 目标稳定性 AI安全 时序问题
相关文章