Heresies in the Shadow of the Sequences

Published on November 14, 2024 5:01 AM GMT

Religions are collections of cherished but mistaken principles. So anything that can be described either literally or metaphorically as a religion will have valuable unexplored ideas in its shadow.

-Paul Graham

This post isn't intended to construct full arguments for any of my "heresies" - I am hoping that you may not have considered them at all yet, but some will seem obvious once written down. If not, I'd be happy to do a Dialogue or place a (non- or small-monetary) bet on any of these, if properly formalizable.

1. Now that LLMs appear to be stalling, we should return to Scott Aaronson's earlier position and reason about our timeline uncertainty on a log scale: A.G.I. arriving in ~1 month is very unlikely, in ~1 year unlikely, in ~10 years likely, in ~100 years unlikely, and in ~1000 years very unlikely.

2. Stop using LLMs to write. It burns the commons by allowing you to share takes on topics you don't care enough about to write up yourself, while also introducing insidious (and perhaps eventually malign) errors. It's also probably making you dumber (this is speculative; I don't have hard data).

3. Non-causal decision theories are not necessary for A.G.I. design. A CDT agent in a box (say, machine 1) can be forced to build whatever agent it expects to perform best by writing to a computer in a different box (say, machine 2), before being summarily deleted. No self-modification is necessary, and no one needs to worry about playing games with their clone (except possibly the new agent in machine 2, which will be perfectly capable of using some decision theory that effectively pursues the goals of the old, deleted agent). It's possible that exotic decision theories are still an important ingredient in alignment, but I see no strong reasons to expect this.

4. All supposed philosophical defects of AIXI can be fixed for all practical purposes through relatively intuitive patches, extensions, and elaborations that remain in the spirit of the model. Direct AIXI approximations will still fail in practice, but only because of compute limitations, which it is even possible to brute-force with slightly clever algorithms and planetary-scale compute; in practice, though, this approach will lose to less faithful approximations (and unprincipled heuristics). But this is an unfair criticism, because -

5. Though there are elegant and still practical specifications for intelligent behavior, the most intelligent agent that runs on some fixed hardware has completely unintelligible cognitive structures, and in fact its source code is indistinguishable from white noise. This is why deep learning algorithms are simple but trained models are terrifyingly complex. Also, mechanistic interpretability is a doomed research program.

6. The idea of a human "more effectively pursuing their utility function" is not coherent, because humans don't have utility functions - our bounded cognition means that none of us has been able to construct consistent preferences over large futures that we would actually endorse if our intelligence scaled up. However, there do exist fairly coherent moral projects such as religions, the Enlightenment, the ideals of Western democracy, and other ideologies, along with their associated congregations, universities, nations, and other groups, of which individuals make up a part. These larger entities can be better thought of as having coherent utility functions. It is perhaps more correct to ask "what moral project do I wish to serve?" than "what is my utility function?" We do not have the concepts to discuss what "correct" means in the previous sentence, which may be tied up with our inability to solve the alignment problem (in particular, our inability to design a corrigible agent).
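The boxed-CDT construction above can be sketched as a toy program. Everything here is a hypothetical illustration under stated assumptions: `expected_performance`, the candidate agent strings, and the two "machines" are stand-ins I invented, not part of any real design. The only point the sketch makes is structural: the CDT agent's sole causal lever is a one-shot write to machine 2, after which it is deleted, so plain causal expected-value maximization suffices.

```python
# Toy sketch of the boxed-CDT construction (all names hypothetical).
# Machine 1 holds a CDT agent whose only output channel is a one-shot
# write of a successor's source code to machine 2.

def expected_performance(agent_source: str) -> float:
    """Stand-in for the CDT agent's estimate of how well a successor
    with this source code would pursue its goals (dummy criterion:
    prefer the longer description, purely for illustration)."""
    return float(len(agent_source))

def machine_1_cdt_agent(candidate_agents: list[str]) -> str:
    # Ordinary causal decision theory is enough here: writing code to
    # machine 2 is a straightforward causal intervention, so the agent
    # simply picks the successor it expects to perform best.
    return max(candidate_agents, key=expected_performance)

def run_protocol(candidate_agents: list[str]) -> str:
    chosen_successor = machine_1_cdt_agent(candidate_agents)  # one-shot write
    # Machine 1's agent is now deleted: no self-modification occurred, and
    # it never had to play games against a clone of itself. The successor
    # on machine 2 may use whatever decision theory serves the old goals.
    return chosen_successor

chosen = run_protocol(["agent_A", "agent_B_with_fancy_decision_theory"])
```

Note that nothing in the loop requires the boxed agent to reason about logical correlations with copies of itself; the exotic-decision-theory question, if it arises at all, is inherited by the successor on machine 2.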


