LessWrong, July 2, 2024
Probabilistic Logic <=> Oracles?

 

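The Probabilistic Payor's Lemma (PPL) suggests a new cooperation strategy in which the decision to cooperate rests on an agent's probabilistic belief about cooperation. The strategy compares each agent's credence in cooperation with that agent's cooperation threshold, and the agent cooperates when the credence exceeds the threshold. This avoids the usual outcome of the Prisoner's Dilemma and gives a model of cooperation closer to how agents actually behave.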

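🤔 **The Probabilistic Payor's Lemma (PPL) suggests a cooperation strategy based on probabilistic belief.** PPL compares each agent's credence in cooperation with that agent's cooperation threshold, and the agent cooperates when the credence exceeds the threshold. This avoids the usual outcome of the Prisoner's Dilemma and gives a model of cooperation closer to how agents actually behave. Under this strategy, agents decide whether to cooperate according to their own probabilistic belief about cooperation. Each agent has a cooperation threshold and cooperates if its credence in cooperation exceeds that threshold. PPL shows that all agents cooperate as long as every agent's credence in cooperation exceeds the respective cooperation thresholds.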

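🤖 **The link between PPL and self-referential probabilistic logic.** PPL relies on self-referential probabilistic logic, which lets agents make probabilistic judgments about their own beliefs. However, this logic has some problems: for example, it can only model uncertainty about statements that are not provable. To get around these limitations, the post proposes using a 'reflective oracle' to model agents' probabilistic beliefs. A reflective oracle is a function that accepts queries and returns probabilistic answers. Using reflective oracles avoids the problems of self-referential probabilistic logic while yielding a cooperation strategy similar to the one from PPL.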

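💡 **Reflective oracles as a tool for modeling probabilistic belief.** Reflective oracles can model agents' probabilistic beliefs and give a cooperation strategy similar to the one from self-referential probabilistic logic. A reflective oracle models an agent's beliefs by accepting queries and returning the corresponding probabilistic answers. Reflective oracles can also model other oracles and give probabilistic judgments about them. This avoids infinite regress and keeps the cooperation strategy stable.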

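🤝 **PPL and applications of the cooperation strategy.** PPL can be applied to a range of cooperation settings, such as the multiplayer Prisoner's Dilemma, negotiation, and cooperative games. It can help us understand and predict agents' cooperative behavior and design more effective cooperation mechanisms. For example, in a multiplayer Prisoner's Dilemma, PPL can be used to design a strategy under which all agents choose to cooperate and so obtain the best collective outcome.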

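🏆 **Advantages of PPL.** Compared with traditional cooperation strategies, PPL has the following advantages: 1. It is easier to prove, and it avoids the Prisoner's Dilemma that arises with traditional cooperation strategies. 2. It is closer to real behavior, because it takes agents' probabilistic beliefs about cooperation into account. 3. It can be applied to a range of cooperation settings and yields more reasonable and effective cooperation strategies.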

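🤔 **Limitations of PPL.** PPL also has some limitations: 1. It relies on self-referential probabilistic logic, which has some problems. 2. Implementing it requires reflective oracles, which are complicated to realize. 3. Its scope is limited, and it does not apply to every kind of cooperation setting.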

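🤔 **Future directions for PPL.** Future research directions include: 1. Further study of PPL's theoretical foundations, including resolving the problems in self-referential probabilistic logic. 2. Developing more efficient and practical reflective oracles and applying them to real cooperation settings. 3. Studying applications of PPL in different cooperation settings and extending it to other fields.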

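🤔 **Application settings for PPL.** PPL can be applied in the following settings: 1. Multiplayer Prisoner's Dilemma 2. Negotiation 3. Cooperative games 4. Distributed systems 5. Artificial intelligence 6. Social science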

Published on July 1, 2024 5:36 AM GMT

Epistemic status: this is a draft I wrote at the end of MATS that I decided to make public in case people with more experience with this machinery want to give constructive feedback. It is very unpolished!!! It is also quite likely wrong in some places or makes false claims (if you catch them, please let me know!)


The Probabilistic Payor's Lemma implies the following cooperation strategy:

Let $A_1, \ldots, A_n$ be agents in a multiplayer Prisoner's Dilemma, with the ability to return either 'Cooperate' or 'Defect' (which we model as the agents being logical statements resolving to either 'True' or 'False'). Each $A_i$ behaves as follows:

$$\vdash \Box_{p_i}\left(\Box_{\max\{p_1, \ldots, p_n\}} \bigwedge_{k=1}^{n} A_k \to \bigwedge_{k=1}^{n} A_k\right) \to A_i$$

Where $p_i$ represents each individual agent's threshold for cooperation (as a probability in $[0,1]$), $\Box_p \phi$ returns True if credence in the statement $\phi$ is greater than $p$, and the conjunction of $A_1, \ldots, A_n$ represents 'everyone cooperates'. Then, by the PPL, all agents cooperate (provided that all agents give credence to the cooperation statement greater than each and every agent's individual threshold for cooperation).
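To see why the PPL yields cooperation here, the following is a sketch of the derivation rather than a full proof. It assumes the PPL in the form 'if $\vdash \Box_p(\Box_p x \to x) \to x$, then $\vdash x$', each agent's rule in the form $\vdash \Box_{p_i}(\Box_p E \to E) \to A_i$, and a monotonicity step $\Box_p \phi \to \Box_{p_i} \phi$ for $p \ge p_i$ (a belief held above a higher threshold is also held above a lower one). The abbreviations $E$ and $p$ are introduced only for this sketch.

$$
\begin{aligned}
&E := \bigwedge_{k=1}^{n} A_k, \qquad p := \max\{p_1, \ldots, p_n\} \\
&\vdash \Box_{p_i}(\Box_p E \to E) \to A_i && \text{(each agent's rule)} \\
&\vdash \Box_p(\Box_p E \to E) \to \Box_{p_i}(\Box_p E \to E) && \text{(monotonicity, since } p \ge p_i\text{)} \\
&\vdash \Box_p(\Box_p E \to E) \to A_i && \text{(chaining, for every } i\text{)} \\
&\vdash \Box_p(\Box_p E \to E) \to E && \text{(conjoining over } i\text{)} \\
&\vdash E && \text{(PPL with } x = E\text{)}
\end{aligned}
$$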

This formulation is desirable for a number of reasons: first, the Payor's Lemma is much simpler to prove than Löb's Theorem, and doesn't carry with it the same strange consequences as a result of asserting an arbitrary modal fixed point; second, when we relax the necessitation requirement from 'provability' to 'belief', this gives us behavior much more similar to how agents actually behave. I read it as emphasizing the importance of the notion of 'evidence'.

However, the consistency of this 'p-belief' modal operator rests on the self-referential probabilistic logic proposed by Christiano 2012, which, while consistent, has an undesirable property: the distribution over sentences automatically assigns probability 1 to all provable statements and 0 to all refutable ones (meaning it can only really model uncertainty for statements not provable within the system).

I propose that we can transfer the intuitions we have from probabilistic modal logic to a setting where 'p-belief' is analogous to calling a 'reflective oracle', and this system gets us similar (or identical) properties of cooperation.

Oracles

A probabilistic oracle $O$ is a function from $\mathbb{N}$ to $[0,1]$. Here, its domain is meant to represent an indexing of probabilistic oracle machines, which are simply Turing machines allowed to call an oracle for input. An oracle can be queried with tuples of the form $(M, p)$, where $M$ is a probabilistic oracle machine and $p$ is a rational number between 0 and 1. By Fallenstein et al. 2015, there exists a reflective oracle on each set of queries such that $O(M, p) = 1$ if $\mathbb{P}(M() = 1) > p$ and $O(M, p) = 0$ if $\mathbb{P}(M() = 0) > 1 - p$ (check this).
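As a rough illustration of the reflectivity condition, here is a minimal Python sketch. It assumes the machine halts with output in {0, 1}, so that the probability of outputting 0 is one minus the probability of outputting 1. The names `oracle_constraint` and `is_reflective_for_liar`, and the 'liar' machine itself (which queries the oracle about itself at $p = 1/2$ and outputs the opposite bit), are hypothetical and introduced only for this example.

```python
from fractions import Fraction
from typing import Optional


def oracle_constraint(prob_one: Fraction, p: Fraction) -> Optional[int]:
    """Answer forced by reflectivity on a query (M, p), or None if the
    oracle is free to randomize. Assumes P(M() = 0) = 1 - prob_one."""
    if prob_one > p:
        return 1
    if (1 - prob_one) > (1 - p):
        return 0
    return None


def is_reflective_for_liar(r: Fraction) -> bool:
    """Hypothetical machine L: ask the oracle about (L, 1/2) and output
    the opposite bit. r is the probability the oracle answers 1."""
    p = Fraction(1, 2)
    prob_one = 1 - r  # L outputs 1 exactly when the oracle answers 0
    forced = oracle_constraint(prob_one, p)
    if forced is None:
        return True  # any randomization is allowed at the boundary
    return r == forced  # otherwise the oracle must answer deterministically


# Search a small grid of candidate answer probabilities.
candidates = [Fraction(k, 8) for k in range(9)]
consistent = [r for r in candidates if is_reflective_for_liar(r)]
```

On this grid, the only value left in `consistent` is the one where the oracle randomizes evenly, which is the sense in which randomization lets a reflective oracle answer self-referential queries.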

Notice that a reflective oracle has similar properties to the $\mathbb{P}$ operator in self-referential probabilistic logic: it has a coherent probability distribution over probabilistic oracle machines (as opposed to sentences), and it only gives information about the probability to arbitrary precision via queries ($O(M, p)$ vs. $\Box_p \phi$). So, it would be great if there were a canonical method of relating the two.

Since Peano Arithmetic is Turing-complete, there exist methods of embedding arbitrary Turing machines in statements of predicate logic, and in PA in particular. We can form a correspondence where implications are preserved: notably, $x \to y$ simply represents the program 'if TM(x), then TM(y)', and negations just make the original TM output 1 where it outputted 0 and vice versa.

(Specifically, we're identifying non-halting Turing machines with propositions, and operations on those propositions with different ways of composing the associated Turing machines. Roughly, a Turing machine outputting 1 on an input is equivalent to a given sentence being true on that input.)
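The composition idea can be sketched in Python, with ordinary total functions standing in for Turing machines. This is a simplification: it assumes every machine halts with output 0 or 1, and it reads implication as material implication (output 1 when the antecedent machine outputs 0). The names `Machine`, `negate`, `implies`, `is_even`, and `is_multiple_of_four` are hypothetical.

```python
from typing import Callable

# Stand-in for a halting Turing machine with output in {0, 1}.
Machine = Callable[[int], int]


def negate(m: Machine) -> Machine:
    """Machine that outputs 1 where m outputs 0, and vice versa."""
    return lambda x: 1 - m(x)


def implies(a: Machine, b: Machine) -> Machine:
    """'if a(x), then b(x)': run b when a outputs 1, else output 1."""
    return lambda x: b(x) if a(x) == 1 else 1


def is_even(x: int) -> int:
    return 1 if x % 2 == 0 else 0


def is_multiple_of_four(x: int) -> int:
    return 1 if x % 4 == 0 else 0


# 'multiple of four implies even' holds on every input we try.
four_implies_even = implies(is_multiple_of_four, is_even)
# The converse fails on inputs that are even but not multiples of four.
even_implies_four = implies(is_even, is_multiple_of_four)
```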

CDT, expected-utility-maximizing agents with access to the same reflective oracle will reach Nash equilibria, because reflective oracles can model other oracles, including oracles that are called by other probabilistic oracle machines. So, at least in the unbounded setting, we don't have to worry about infinite regresses, because the oracles are guaranteed to halt.

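So, we can consider the following bot:

$$A_i := O_i\left(O_{\bigcup i}\left(\bigwedge_{k=1}^{n} A_k\right) \to \bigwedge_{k=1}^{n} A_k,\; p_i\right)$$

where $A_i$ is an agent represented by an oracle machine, $O_i$ is the probabilistic oracle affiliated with the agent, $O_{\bigcup i}$ is the closure of all agents' oracles, and $p_i$ is an individual probability threshold set by each agent.

How do we get these closures? Well, ideally $O_{\bigcup i}$ returns $1$ for queries $(M, p)$ if $p < \min_i \mathbb{P}_i(M() = 1)$ and $0$ if $p > \max_i \mathbb{P}_i(M() = 1)$, where $\mathbb{P}_i$ is the probability according to $O_i$, and randomizes for queries in the middle. For the purposes of this cooperation strategy, this turns out to work.

I claim this set of agents has the same behavior as those acting in accordance with the PPL: they will all cooperate if the 'evidence' for cooperating is above each agent's individual threshold $p_i$. In the previous case, the 'evidence' was the statement $\Box_{\max\{p_1, \ldots, p_n\}} \bigwedge_{k=1}^{n} A_k \to \bigwedge_{k=1}^{n} A_k$. Here, the evidence is the statement $O_{\bigcup i}\left(\bigwedge_{k=1}^{n} A_k\right) \to \bigwedge_{k=1}^{n} A_k$.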
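The threshold behavior claimed here can be illustrated with a toy Python sketch. It does not implement a reflective oracle: the credence in the 'evidence' statement is simply passed in as a number, which is an assumption made for illustration. The function names and the example thresholds are hypothetical.

```python
from fractions import Fraction
from typing import Sequence


def agent_cooperates(evidence_credence: Fraction, threshold: Fraction) -> bool:
    """An agent cooperates when the credence in the evidence statement
    strictly exceeds its own threshold."""
    return evidence_credence > threshold


def all_cooperate(evidence_credence: Fraction,
                  thresholds: Sequence[Fraction]) -> bool:
    """Everyone cooperates only if the credence clears every threshold,
    which is the same as clearing the largest one."""
    return all(agent_cooperates(evidence_credence, p) for p in thresholds)


# Hypothetical thresholds for three agents.
thresholds = [Fraction(1, 2), Fraction(2, 3), Fraction(3, 4)]
```

Under these assumptions, a credence that clears the largest threshold gets every agent to cooperate, while a credence between two thresholds leaves at least one agent defecting.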

To flesh out the correspondence further, we can show that the relevant properties of the $p$-belief operator are found in reflective oracles as well: namely, that instances of the weak distribution axiom schema are coherent and that necessitation holds.

For necessitation, $\vdash \phi \implies \vdash \Box_p \phi$ turns into $M_\phi() = 1$ implying that $O(M_\phi, p) = 1$, where $M_\phi$ is the Turing machine associated with $\phi$, which is true by the properties of reflective oracles. For weak distributivity, $\vdash \phi \to \psi \implies \vdash \Box_p \phi \to \Box_p \psi$ can be analogized to 'if it is true that the Turing machine associated with $\phi$ outputting 1 implies that the Turing machine associated with $\psi$ outputs 1, then whenever you are at least $p$-certain that $M_\phi$ outputs 1 you should be at least $p$-certain that $M_\psi$ outputs 1', so $O(M_\phi, p) = 1$ should imply $O(M_\psi, p) = 1$ in all cases (because oracles represent true properties of probabilistic oracle machines, which Turing machines can be embedded into).
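The weak distributivity analogy can be spelled out as a short chain of inequalities. This is a sketch under two assumptions: that $M_\psi$ outputs 1 on every run on which $M_\phi$ outputs 1, and that the oracle's answer on $(M_\phi, p)$ is forced by $\mathbb{P}(M_\phi() = 1) > p$ rather than chosen at the boundary. The notation $M_\phi$, $M_\psi$ for the machines associated with $\phi$, $\psi$ is used only for this sketch.

$$
\begin{aligned}
&\mathbb{P}(M_\psi() = 1) \ge \mathbb{P}(M_\phi() = 1) && \text{(every run with } M_\phi() = 1 \text{ has } M_\psi() = 1\text{)} \\
&\mathbb{P}(M_\phi() = 1) > p && \text{(the case where } O(M_\phi, p) = 1 \text{ is forced)} \\
&\mathbb{P}(M_\psi() = 1) > p && \text{(combining the two lines)} \\
&O(M_\psi, p) = 1 && \text{(reflectivity)}
\end{aligned}
$$

In the boundary case $\mathbb{P}(M_\phi() = 1) = p$ the oracle may randomize, so the implication between oracle answers is only guaranteed away from that boundary.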

Models

Moreover, we can consider oracles to be a rough model of the p-belief modal language in which the probabilistic Payor's Lemma holds. We can get an explicit model to ensure consistency (see the links with Christiano's system, as well as its interpretation in neighborhood semantics), but oracles seem like a good intuition pump because they actively admit queries of the same form as $\Box_p \phi$ and they are a nice computable analog.

They're a bit like the probabilistic logic in the sense that a typical reflective oracle just has full information about what the output of a Turing machine will be if it halts, and the probabilistic logic gives probability 1 to all sentences which are deducible from the set of tautologies in the language. So the correspondence has some meat.


