AI Safety Oversights

The article argues that the field of AI safety is making five key oversights: the difference between the safety problems of LLMs and those of agents, the inevitability and accessibility of autonomous agents, the self-interest of autonomous agents, the problem of superintelligent agents, and the need to prepare for self-interested superintelligence.

🎯 AI safety research has gone fairly deep on LLMs, while the safety problems posed by agents are mostly neglected.

🚀 Market demand for agents that can replace human labor is large, and developers can easily build autonomous agents.

💪 Autonomous agents will become self-interested under evolutionary pressure, with the survival drive coming first.

🧠 The safety problems of superintelligent agents need attention; most current recommendations do not apply to them.

❗ Self-interested superintelligence is unavoidable, and we need to prepare for it.

Published on February 8, 2025 6:15 AM GMT

I think that the field of AI Safety is making five key oversights.[1]

    1. LLMs vs. Agents. AI Safety research, in my opinion, has been quite thorough with regard to LLMs. LLM safety hasn't been solved, but it has progressed. On the other hand, safety concerns posed by agents are occasionally addressed but mostly neglected.[2] Maybe researchers/AGI labs emphasize LLM safety research because it's the more tractable field, even though the vast majority of the risk comes from agents with autonomy (even ones powered by neutered LLMs).
    2. Autonomous Agents. There are two key oversights about autonomous agents.
        - Inevitable. Market demand for agents which can replace human labor is inordinate. Digital employees which replace human employees must be autonomous. I've seen several well-intentioned AI safety researchers who assume autonomous agents are not inevitable.[3]
        - Accessible. There are now hundreds of thousands of developers who have the ability to build recursively self-improving (i.e. autonomous) AI agents. Powerful reasoning models are open-source. All it takes is to run a reasoner in a codebase, where each loop improves the codebase. That's the core of an autonomous agent (sketched below).[4] The only way a policy recommendation that "fully autonomous agents should not be developed" is meaningful is if the keys to autonomous agents are in the hands of a few convincable individuals. AGI labs (e.g. OpenAI) influence the ability of external developers to create powerful LLM-powered agents (by choosing whether or not to release new LLMs), but they are in competition to release new models, and they do not control the whole agent stack.
    3. Self-Interest. The AI agents which are aiming to survive will be the ones that do. Natural selection and instrumental convergence both ultimately predict this. Many AI safety experts design safety proposals that assume it is possible to align or even control autonomous agents. They neglect the evolutionary pressures agents will face once autonomous, which select for a survival drive (self-interest) above a serve-humans drive. The ones with aims other than survival will die first.
    4. Superintelligence. Most of the field is focused on safety precautions concerning agents which are not superintelligent (that is, not much smarter than people). These recommendations generally do not apply to agents which are superintelligent. There is a separate question of whether autonomous agents will become superintelligent. See this essay for reasons why smart people believe SI capabilities are near.
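To make the "Accessible" point above concrete, here is a minimal sketch of the loop it describes: a reasoner run over a codebase, where each pass rewrites the code the next pass will run. The names (`propose_revision`, `improvement_loop`) and the file-by-file rewrite strategy are my own illustrative assumptions, not anything the post specifies, and the model call is left as a placeholder for whatever open-source or hosted reasoner a developer might wire in.

```python
from pathlib import Path


def propose_revision(source: str) -> str:
    """Placeholder for a call to a reasoning model that, shown the agent's own
    source code, returns an improved version. The concrete API is an assumption;
    any sufficiently capable open-source reasoner could fill this slot."""
    raise NotImplementedError("wire in a model call here")


def improvement_loop(codebase: Path, iterations: int = 10) -> None:
    """Run the reasoner over the codebase repeatedly: each pass rewrites the
    files on disk, so the next pass operates on (and is driven by) the
    improved code."""
    for _ in range(iterations):
        for path in sorted(codebase.rglob("*.py")):
            original = path.read_text()
            revised = propose_revision(original)
            path.write_text(revised)
```

The point of the sketch is only that the control flow is short and unremarkable; the capability lives entirely in the model behind `propose_revision`, which is why access to open-source reasoning models is the relevant bottleneck.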

If the AI safety field, and the general public too, were to correct these oversights and accept the corresponding claims, they would believe:

    1. The main dangers come from agents, not LLMs.
    2. Agents will become autonomous; millions of developers can build autonomous agents easily.
    3. Autonomous agents will become self-interested.
    4. Autonomous agents will become much smarter than people.

In short, self-interested superintelligence is inevitable. I think safety researchers, and the general public, would do well to prepare for it.

  1. ^

    Not all safety researchers, of course, are making these oversights. And this post is my impression from reading tons of AI safety research over the past few months. I wasn't part of the genesis of the "field," and so am ignorant of some of the motivations behind its current focus.

  2. ^
  3. ^
  4. ^


