What program structures enable efficient induction?

The article explores, within the framework of Solomonoff induction, how ruling out falsified hypotheses can speed up induction on future observations, along with related learning methods and their relevance to the alignment problem.

🎯 In the Solomonoff induction framework, each new observation rules out a large number of incorrect hypotheses, shrinking the space of remaining hypotheses and thereby speeding up adaptation to new situations.

📈 Incremental modification: modifying and extending the current program is more efficient than searching for a new program from scratch, and better matches how animals and humans actually update their knowledge; this raises the question of how to build program structures that allow incremental modification of existing hypotheses.

📦 Modularity: a modular program structure decomposes into loosely coupled components, so that when a prediction error occurs only a small part of the program needs to be modified to accommodate the new observation, which supports efficient learning.

📋 Compression: viewing Solomonoff induction as enumerating programs as bitstrings, a compressed encoding can skip over hypotheses that have already been falsified; learning such an encoding introduces new problems, but approximate encodings that generalize may still capture much of the benefit.

🔄 Closing the loop: Solomonoff induction can be seen as compression over the space of observations, and an approximate compressed encoding is in essence compression over the space of programs; applying approximate encodings recursively enables meta-learning at every level and learning abstract meta-patterns.

Published on September 5, 2024 10:12 AM GMT

previously: My decomposition of the alignment problem

A simple model of meta/continual learning

In the framework of Solomonoff induction, we observe an infinite stream of bits and try to predict the next bit by finding the shortest hypothesis which reproduces our observations (some caveats here). When we receive an additional bit of observation, in principle we can rule out an infinite number of hypotheses (namely, all programs which didn't predict that observation), which creates an opportunity to speed up our induction process for future observations. Specifically, as we search for the next shortest program which predicts our next bit of observation, we can learn to skip over the programs that have already been falsified by our past observations. The process of "learning how to skip over falsified programs" takes time and computation upfront, but it can pay dividends of computational efficiency for future induction.
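To make the skipping idea concrete, here is a minimal toy sketch (my own illustration, not anything from the post): hypotheses are a small hand-picked set of predictor functions rather than arbitrary programs, and "length" is just their index in the enumeration. The only point is the caching step, where hypotheses falsified by past observations are remembered so that later prediction steps skip over them.

```python
# Toy stand-in for Solomonoff induction: a finite, hand-rolled hypothesis
# space ordered by (pretend) description length, plus a cache of falsified
# hypotheses so future induction steps can skip them.

from typing import Callable, List

Hypothesis = Callable[[List[int]], int]  # maps history of bits -> predicted next bit

def enumerate_hypotheses() -> List[Hypothesis]:
    """A tiny 'program space', ordered by a stand-in for description length."""
    return [
        lambda h: 0,                      # always predict 0
        lambda h: 1,                      # always predict 1
        lambda h: h[-1] if h else 0,      # repeat the last bit
        lambda h: 1 - h[-1] if h else 1,  # flip the last bit
        lambda h: len(h) % 2,             # alternate, starting from 0
    ]

class Inductor:
    def __init__(self) -> None:
        self.hypotheses = enumerate_hypotheses()
        self.falsified = set()            # indices already contradicted by past observations
        self.history: List[int] = []

    def predict(self) -> int:
        # Use the shortest (lowest-index) hypothesis that has not been falsified.
        for i, h in enumerate(self.hypotheses):
            if i not in self.falsified:   # the "skip over falsified programs" step
                return h(self.history)
        return 0                          # fallback if everything is falsified

    def observe(self, bit: int) -> None:
        # Rule out every surviving hypothesis that mispredicted this bit.
        for i, h in enumerate(self.hypotheses):
            if i not in self.falsified and h(self.history) != bit:
                self.falsified.add(i)
        self.history.append(bit)

if __name__ == "__main__":
    ind = Inductor()
    for bit in [0, 1, 0, 1, 0, 1]:
        print("predict:", ind.predict(), "observe:", bit)
        ind.observe(bit)
```

In real Solomonoff induction the hypothesis space is infinite and the falsified set can't be stored extensionally, which is exactly why the interesting question is what compressed or structured representation of "what's been ruled out" could play the role of that cache.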

This is my mental model for how agents can "learn how to learn efficiently": an agent who has received more observations can usually adapt to new situations more quickly because more incorrect hypotheses have already been ruled out, leaving a narrower set of remaining hypotheses to choose from.

More generally, an important question to ask is: given that the underlying space of remaining hypotheses is constantly shrinking as we receive new observations, what sorts of data structures for representing hypotheses should we use to exploit that? How should we represent programs if we don't just want to execute them, but also potentially modify them into other plausible hypotheses? If a world model is selected based on its ability to quickly adapt to new environments, what is the type signature of that world model?
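As one purely illustrative answer (all names here are hypothetical, not a proposal from the post): represent a hypothesis as a composition of small, loosely coupled modules, so that "modifying it into another plausible hypothesis" becomes a local edit, swapping a single module, rather than a fresh search over the whole program space.

```python
# Hypothetical sketch: a hypothesis as a pipeline of swappable modules,
# where adaptation after a prediction error explores single-module edits
# instead of re-enumerating programs from scratch.

from dataclasses import dataclass
from typing import Callable, List, Tuple

Module = Callable[[int], int]

@dataclass(frozen=True)
class ModularHypothesis:
    modules: Tuple[Module, ...]   # loosely coupled stages, applied in order

    def predict(self, x: int) -> int:
        for m in self.modules:
            x = m(x)
        return x

    def neighbours(self, library: List[Module]) -> List["ModularHypothesis"]:
        """Hypotheses reachable by swapping one module: the incremental-
        modification move, much cheaper than a global search."""
        out = []
        for i in range(len(self.modules)):
            for m in library:
                if m is not self.modules[i]:
                    out.append(ModularHypothesis(self.modules[:i] + (m,) + self.modules[i + 1:]))
        return out

if __name__ == "__main__":
    library: List[Module] = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
    h = ModularHypothesis((library[0], library[1]))
    print(h.predict(4))               # (4 + 1) * 2 = 10
    print(len(h.neighbours(library))) # local edits to try after a misprediction
```

The design choice being illustrated is that the representation itself defines which other hypotheses are "nearby", and therefore how cheap it is to adapt when an observation falsifies the current one.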

Quick thoughts

Why this might be relevant for alignment

Transformative AI systems will often need to modify their ontologies in order to accommodate new observations, which means that if we want to translate our preferences over real-world objects into the AI's world model, we need to be able to stably "point" to real-world objects despite ontology shifts. If efficient learning relies on specific data structures for representing hypotheses, those structures may reveal properties that remain invariant under ontology shifts. By identifying these invariant properties, we can potentially create robust ways to maintain our preferences within the AI's evolving world model.

Furthermore, insofar as humans utilize a similar data structure to represent their world models, this could provide insights into how our actual preferences remain consistent despite ontology shifts, offering a potential blueprint for replicating this process in AI.



