LessWrong, October 12, 2024
Does 5-and-10 [/logical counterfactuals] not just reduce to the base case of Tiling Agents?

 

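The post discusses tiling agents and the 5-and-10 problem. It notes that a tiling agent is logically more powerful than its successor agents, which is what lets it solve certain verification problems. It also argues that the 5-and-10 problem can be made no harder than the base case of tiling agents, and proposes using a temporary self-model at each decision junction where 5-and-10 might arise, though the author is unsure whether something has been missed.

🧐 Tiling agents are logically more powerful than their successors. Tiling agents as currently imagined are computationally intractable, but they can verify that a successor's utility function maintains certain invariants by making the successor logically weaker.

🤔 In the 5-and-10 problem, choosing $5 requires believing that $5 is worth more than $10. The author argues that a model of oneself choosing $5 can be made logically weaker than oneself, so that the invariants of its utility function can be verified.

💡 Handling each decision junction where 5-and-10 could arise with a temporary self-model may be one form of reflective values check, though the author is unsure whether something has been missed.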

Published on October 11, 2024 9:02 PM GMT

To recap:

Tiling agents are logically more powerful than their successor agents. [ This is how they can verify them even though first-order logics can't verify themselves. ]

The 5-and-10 problem:

"I have to decide between $5 and $10. Suppose I decide to choose $5. I know that I'm a money-optimizer, so if I do this, $5 must be more money than $10, so this alternative is better. Therefore, I should choose $5."

[End recap]

Maybe I'm being dense, but I've never heard a plausible-sounding story about how the 5-and-10 problem [or analogous problems involving reflective reasoning about "logical counterfactuals"] is supposed to become a problem in a way that, on reflection, couldn't be solved by augmenting the bolded integument above with a subjective gut values check to ground all incoming logical proofs of what I choose.

As a limiting case of even the formal version of the problem: Even if I am reasoning about a model of an exact copy of myself having already chosen the $5, the model is going to have to be somewhat simplified anyway since I am not smaller than myself. In the same way that theoretical tiling agents verify that their [smarter] successors' utility functions maintain certain invariants by making them [the successors] logically weaker, I should just as well be able to make my model of myself as having picked the $5 [which is the same intelligence as I am] logically weaker than myself, and verify whether or not its utility function has just as many invariants maintained.
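As a rough illustration of the kind of check described above, here is a toy Python sketch. It is not a formalization of tiling agents and says nothing about proof-theoretic strength. The names `SelfModel`, `prefers_more_money`, and `choice_is_consistent` are hypothetical, and the single invariant "more money is always better" is an assumption standing in for whatever invariants a real values check would use.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class SelfModel:
    """Simplified, throwaway model of 'me, having already chosen'."""
    utility: Callable[[int], float]  # utility the model assigns to a dollar amount
    choice: int                      # the option the model is assumed to have picked


def prefers_more_money(utility: Callable[[int], float],
                       options: Sequence[int]) -> bool:
    """Assumed invariant: utility strictly increases with the dollar amount."""
    ordered = sorted(options)
    return all(utility(a) < utility(b) for a, b in zip(ordered, ordered[1:]))


def choice_is_consistent(model: SelfModel, options: Sequence[int]) -> bool:
    """Accept the model only if the invariant holds AND its choice is optimal."""
    return (prefers_more_money(model.utility, options)
            and model.choice == max(options, key=model.utility))


options = (5, 10)

# A model with my real utility function that nonetheless "chose" $5:
# the invariant holds, so the choice itself is what gets rejected.
honest = SelfModel(utility=float, choice=5)

# A model whose utility was warped so that $5 looks better than $10:
# here the invariant check fails before the choice is even examined.
warped = SelfModel(utility=lambda dollars: -float(dollars), choice=5)

print(choice_is_consistent(honest, options))
print(choice_is_consistent(warped, options))
```

Either way, the hypothetical "I chose $5" never gets to rewrite what counts as more money. It is tested against the invariant from one level up and then discarded.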

Tiling agents as currently imagined are computationally intractable for the obvious reason that if you want a chain of N successors all maintaining the invariants, your zeroth successor has to have ~N degrees of logical power [depending on how you count].

Verifying a temporary self-model, on the other hand, that is used to make a decision and then discarded, is just the "base case" of the tiling-agent idea - you only need one level of logical remove on the "stack".
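The cost comparison in the last two paragraphs can be pictured with a toy bookkeeping sketch. It assumes, purely for illustration, that "logical power" can be counted in integer levels and that each verification step costs exactly one level. The function names are hypothetical and nothing here models actual proof systems.

```python
from typing import List


def can_verify(verifier_level: int, target_level: int) -> bool:
    """Toy rule: an agent can only verify something strictly weaker than itself."""
    return verifier_level > target_level


def levels_needed(num_successors: int) -> int:
    """Level the first agent needs if the last successor sits at level 0."""
    if num_successors < 0:
        raise ValueError("num_successors must be non-negative")
    return num_successors


def chain_levels(start_level: int, num_successors: int) -> List[int]:
    """Levels along a chain in which each agent is one level below its verifier."""
    levels = [start_level - i for i in range(num_successors + 1)]
    if levels[-1] < 0:
        raise ValueError("not enough logical power for a chain this long")
    return levels


# A chain of N successors needs roughly N levels at the start...
chain = chain_levels(levels_needed(3), 3)
print(chain)

# ...while a temporary self-model is a chain of length one.
print(levels_needed(1))
```

Under these assumptions the temporary self-model always costs one level, however many decisions are made, because each model is discarded rather than extended into a chain.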

The foregoing pseudo-formalism is one thing that could bear load in one possible type of values check involving reflection. I have no idea what else you would need to implement a reflective values check, and I certainly don't consider myself to have solved how to actually implement the base case of tiling agents in practical, best-practice terms! But in any case, unless I'm missing something, 5-and-10 doesn't seem any harder than the very base case of Tiling Agents.

Benja Fallenstein seems to have noted in a 2014 paper that Tiling Agents beats 5-and-10:

However, there is an important difference, which makes Slepnev’s system vulnerable to what is called the “5-and-10 problem”, whereas Yudkowsky and Herreshoff’s system is not. Essentially, this is because the latter system only considers what happens if the action is taken that the agent actually ends up taking, whereas Slepnev’s system also considers what would have happened if the agent had taken a different action.

[emphases mine]

I see no reason this could not be generalized, using temporary self-models at every decision-junction where 5-and-10 would otherwise become a problem.

However, I could still be missing something.

Am I missing something?


