Why Can’t Sub-AGI Solve AI Alignment? Or: Why Would Sub-AGI AI Not be Aligned?

LessWrong, July 3, 2024


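This post explores whether a sub-AGI (an AI below the level of artificial general intelligence) could be trained to solve the AI alignment problem. The author argues that current AI is already aligned within the limits of its capabilities, and that PhD-level researchers would eventually solve AI alignment. The author assumes that, under the current technical paradigm, AI will not become unaligned before it reaches PhD-level intelligence. We could therefore train an AI until it reaches PhD-level intelligence and then have it solve AI alignment, without it needing to self-improve.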

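🤔 **Current state of AI alignment:** The author argues that current AI is aligned within the limits of its capabilities and does not deliberately harm humans.

🧠 **PhD-level intelligence and the alignment problem:** The author assumes that PhD-level researchers would eventually solve AI alignment, and that PhD-level intelligence is below artificial general intelligence (AGI).

🤖 **Training a sub-AGI to solve alignment:** The author proposes training an AI until it reaches PhD-level intelligence and then having it solve AI alignment, without it needing to self-improve.

❓ **The intelligence level at which AI becomes unaligned:** The author notes that it is unclear at what intelligence level an AI would become unaligned, and that this is the point the author is least sure of.

💡 **Advantages of a sub-AGI:** The author argues that training a sub-AGI to solve AI alignment avoids the risks of AI self-improvement while still putting its strong capabilities to work on the alignment problem.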

Published on July 2, 2024 8:13 PM GMT

I believe there are people with far greater knowledge than me who can point out where I am wrong. I do believe my reasoning is wrong somewhere, but I cannot see why it would be highly infeasible to train a sub-AGI AI that would most likely be aligned and able to solve AI alignment.

My assumptions are as follows:

1. Current AI seems aligned to the best of its ability.
2. PhD level researchers would eventually solve AI alignment if given enough time.
3. PhD level intelligence is below AGI in intelligence.
4. There is no clear reason why current AI using current paradigm technology would become unaligned before reaching PhD level intelligence.
5. We could train AI until it reaches PhD level intelligence, and then let it solve AI Alignment, without itself needing to self improve.

The point I am least confident in is 4, since we have no clear way of knowing at what intelligence level an AI model would become unaligned.

Attached is my mental model of how much intelligence different tasks require and how much different people have.

Figure 1: My mental model of natural research capability (RC), which is basically IQ but more strongly correlated with research ability. The intelligence needed to align AI is above the average PhD level but below the smartest human in the world, and even further from AGI.
