少点错误 · August 2, 2024
Marketic Agents project intro

This post explores an analogy between markets and artificial general intelligence (AGI), arguing that AGI could be built as a "market of simpler programs". It examines the similarities between markets and agents in computing decisions and beliefs, and the distinctive advantages markets offer in scale, generalization, meta-learning, and bounded rationality. The author argues that markets realize distributed intelligence, composing complex agents out of simple programs, and offers markets as a source of inspiration for AI design.

🤔 Markets and agents are similar in how they handle decisions and beliefs. Both compute decisions (in the form of allocations of goods) and beliefs (for example, via prediction markets). Those beliefs may be inconsistent (arbitrage) or incomplete, and by a simple counting argument they must be incomplete.

📈 Markets have variable scale. With no fixed structure, a market can adjust its size dynamically to the complexity of the task; unlike neural networks, it scales by dynamically adjusting the supply of goods and services.

🧠 Markets are capable of continual learning and meta-learning. A market can learn many different tasks and adapt to environmental change, continually improving the supply of goods and services through competition and cooperation.

🤖 Markets are a potential route to AGI. Applying market mechanisms to AI design could yield AGI systems with distributed intelligence, variable scale, and the ability to keep learning.

🤝 Markets offer a fresh perspective on the alignment problem. A market is itself an aligned superintelligent system, and can serve as an "intuition pump" for designing aligned AGI.

🧠 Bounded rationality in markets can be viewed as a learning process. Each participant is a boundedly rational learner that adapts its behaviour to the market environment, yielding optimization at the level of the whole.

📊 A market's generalization ability comes from its adaptability to varied tasks and environments, which in turn arises from competition and cooperation among participants.

💡 Market mechanisms may also serve as tools and frameworks for AI's broader challenges: evaluation, collaboration, and the safety, fairness, privacy, and ethical questions that AI raises for society, law, culture, and the economy.

Published on August 1, 2024 8:58 PM GMT

I believe that we can build AGI as “markets of simpler programs”. In particular we should, because markets are an example of real-life Aligned Superintelligence, and they serve as a source for intuition-pumping for alignment.

Contents / summary:

Analogies between markets and agents

The basic point is that

    1. Markets are boundedly rational agents.
    2. You can actually get intelligence as an emergent property of markets (rather than the intelligence being just from the intelligence of the individual participants in the market). So you could imagine just building a market out of really "dumb" / simple programs, and the resulting market would be an intelligent agent.

The first point is clearly true; the latter could be more controversial. But (1) the idea of a multi-agent basis of mind/intelligence is not original to me, and has precedent in the literature — see “Related work” for details (2) this is intuitively very plausible, right? Think e.g. the “I, Pencil” essay: no human is intelligent / capable enough to single-handedly produce a pencil, yet the market distributes this highly complex task amongst many dumb humans. Specifically, markets achieve this level of distributedness via two methods:

    1. Modularity of action — instead of a single agent that completes the entire task $T$, the task is decomposed into many smaller tasks $T_1, \dots, T_n$, with each little step optimized separately (by selecting the best agent for the subtask).
    2. Modularity of state — instead of handing the entire state $s$ to an agent to transform it, we factor the state into dimensions called "goods": $s = (g_1, \dots, g_n)$. See "What are markets? The math" for details.
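These two modularities can be sketched in code (a toy rendering of my own, not the paper's formalism), in the spirit of the "I, Pencil" example: the state is a bundle of named goods, and each agent reads and writes only the dimensions it specializes in.

```python
# Toy illustration (my own, hypothetical) of the two modularities:
# modularity of state (the state is a dict of named "goods") and
# modularity of action (each agent performs one subtask on a subset
# of the goods, rather than one agent doing the whole pencil).

state = {"wood": 1.0, "graphite": 1.0, "pencil": 0.0}

def miller(s):
    # Specializes on the "wood" good only: turns wood into a shaft.
    s["wood"] -= 1.0
    s["shaft"] = s.get("shaft", 0.0) + 1.0
    return s

def assembler(s):
    # Reads "shaft" and "graphite", writes "pencil".
    if s.get("shaft", 0.0) >= 1.0 and s["graphite"] >= 1.0:
        s["shaft"] -= 1.0
        s["graphite"] -= 1.0
        s["pencil"] += 1.0
    return s

# Modularity of action: the full task is a chain of small specialized steps.
for step in (miller, assembler):
    state = step(state)
```

No single agent here knows how to make a pencil; the pipeline as a whole does.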

Cool features of markets

Markets serve as an example of “already-existing (aligned) AGI”, so they serve as a nice setting for us to play with concepts of intelligence and alignment.

So even if we don’t want to build market-based agents, markets serve as a valuable intuition pump for the AI agents we do build.

Some interesting properties of markets / concepts of intelligence in the context of markets —

Variable scale

One bad thing about neural networks is that they have "fixed depth". ChatGPT finds the prompts "Complete this sentence: Eenie Meenie Minie Moe, Catch the …" and "Debug this fairly complex piece of code" to be equally hard / to take an equal number of computational steps.

I imagine this means that increasingly complex tasks just require more and more scale, and thus impractical amounts of resources.

(Note: this doesn’t mean LLMs are safe, because at some amount of scale the LLM can just become smart enough to build a more powerful AI framework, like marketic agents, or something like marketic agents but bad.)

Markets, by contrast, do not necessarily have a fixed structure. In the simplest example of a market (see “What are markets? The math”), you just have “the state of the world” getting auctioned to successive agents, which add value to it in some way. Depending on this initial state, the number of steps it goes through before being bought by the consumer (i.e. earning reward), and the computational cost at each step (the wage paid to the agent) could be completely different.
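A minimal sketch of such a successive-auction market (my own toy rendering, not the formal definition from the paper): the state is sold to the highest bidder, transformed, and resold until a consumer is willing to buy it, so the chain length depends on the initial state rather than being fixed in advance.

```python
# Hypothetical sketch: the world-state is auctioned to whichever simple
# agent bids highest; that agent transforms the state; the chain ends
# when the consumer buys the finished state (i.e. reward is earned).
# The number of steps is not fixed in advance; it depends on the state.

class Agent:
    def __init__(self, condition, action, margin):
        self.condition = condition  # which states this agent can improve
        self.action = action        # how it transforms the state
        self.margin = margin        # profit margin it demands

    def bid(self, state, resale_estimate):
        # Bid only on states it can handle, at estimated resale minus margin.
        if self.condition(state):
            return resale_estimate - self.margin
        return None

def run_market(state, agents, consumer_value, steps=10):
    for _ in range(steps):
        if consumer_value(state) > 0:       # consumer buys: chain ends
            return state
        bids = [(a.bid(state, 1.0), a) for a in agents]
        bids = [(b, a) for b, a in bids if b is not None]
        if not bids:
            return state
        _, winner = max(bids, key=lambda x: x[0])
        state = winner.action(state)
    return state

# Two "dumb" agents: one doubles small numbers, one increments odd numbers.
agents = [
    Agent(lambda s: s < 8, lambda s: s * 2, margin=0.1),
    Agent(lambda s: s % 2 == 1, lambda s: s + 1, margin=0.2),
]
final = run_market(3, agents, consumer_value=lambda s: 1.0 if s >= 10 else 0.0)
```

Starting from state 3, the chain takes two steps (3 → 6 → 12); a different starting state would yield a different chain length and composition.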

There is no upper bound on the scale / no “number of parameters” as such, because markets kinda do their own hyperparameter-optimization dynamically as we will discuss shortly.

You can get variable scale with LLMs by patching them together into an agentic workflow, but there's no "optimization" over this structure like there is within the LLM.

Generalization and Continual Learning

I'm going to say a weird thing: out-of-distribution generalization is a form of within-distribution generalization — but a type of generalization that gradient descent is not very good at.

What does this mean?

Suppose you expose a learning algorithm to a stream of “varying” training data. Then ideally it should learn the variation in the training data and learn to be “adaptable” to the sort of variation present. Evolutionary algorithms do this to an extent: humans evolved to be adaptable, rather than overfit to every new twist of environmental fate, because the overfitting guys kept getting culled while the adaptable guys could pass all the tests.

Gradient descent doesn't really do this: it is subject to catastrophic forgetting. If you optimize loss function $L_1$, then optimize loss function $L_2$, there is no sense in which you optimize the average loss function $\tfrac{1}{2}(L_1 + L_2)$, or learn to check a few samples to see which environment we're in, or anything like that: it simply myopically follows the direction given by the current gradient.

(Well, actually you can do this if you’re optimizing each loss function for like 1 step or a few — i.e. SGD — because you haven’t gone too far yet so the gradients live in the same tangent space. But the point is that gradient descent cannot learn “global” or “long-term” trends in the data stream.)
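The contrast can be shown numerically (a sketch with made-up one-dimensional losses, not a general proof): fully optimizing L1 = (w - 1)^2 and then L2 = (w + 1)^2 ends at the minimizer of L2 alone, while interleaving single steps hovers near the minimizer of the average loss.

```python
# Catastrophic forgetting in one dimension. L1 = (w-1)^2 has minimum at
# w = 1; L2 = (w+1)^2 has minimum at w = -1; their average has minimum
# at w = 0. (Illustrative losses chosen for this sketch.)

def grad_L1(w):
    return 2 * (w - 1)   # d/dw of (w-1)^2

def grad_L2(w):
    return 2 * (w + 1)   # d/dw of (w+1)^2

lr = 0.1

# Sequential: optimize L1 to convergence, then L2 to convergence.
w = 0.0
for _ in range(200):
    w -= lr * grad_L1(w)   # converges to w = 1
for _ in range(200):
    w -= lr * grad_L2(w)   # overwrites the first phase: converges to w = -1
w_sequential = w

# Interleaved (SGD-like): alternate single steps on each loss.
w = 0.0
for _ in range(200):
    w -= lr * grad_L1(w)
    w -= lr * grad_L2(w)
w_interleaved = w          # oscillates near w = 0, the average-loss minimizer
```

The sequential run retains nothing of L1; the interleaved run implicitly tracks the average loss, which is the point about gradients living in the same tangent space.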

Markets, I think, are capable of continual learning, at least as long as you don’t let monopolies form.

Meta-learning and derived demand

Let's think about the meta-learning problem. Suppose $T$ is a task (a distribution to learn). But in particular, the task $T$ is itself sampled from some distribution $p(T)$ — equivalently, $T$ really depends on some other random variable $z$, i.e. is really a conditional distribution $T(\cdot \mid z)$. Then the meta-learning goals are to learn $T(\cdot \mid z)$ and $z$.

Now suppose you're a skyscraper builder. Building skyscrapers entails some task $T$ (a value function on possible actions) — but this task depends on local circumstances, like earthquake frequency, soil type, and nimby infestation, which may be represented by some other random variable $z$. The builder must learn $T(\cdot \mid z)$, while $z$ is learned by providers of some information, or more generally by producers of factor goods (including information).
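A minimal numeric sketch of this derived demand (all names and numbers here are hypothetical, chosen for illustration): the builder's willingness to pay for an estimate of the latent variable is derived entirely from how much that estimate improves its task payoff.

```python
# Hypothetical illustration of derived demand for information: a "surveyor"
# sells an estimate of the latent variable z (say, soil type), and the
# builder buys it only when the estimate raises its expected payoff by
# more than the asking price. All numbers are made up for this sketch.

price = 0.3          # surveyor's asking price for its estimate of z
surveyor_acc = 0.9   # probability the surveyor's estimate is correct

def builder_expected_value(buy_info: bool) -> float:
    # Payoff 1 if the builder's design matches z, 0 otherwise.
    if buy_info:
        return surveyor_acc * 1.0 - price   # act on the purchased estimate
    return 0.5 * 1.0   # guess between two equally likely soil types

# Demand for the information good is *derived* from the task's value function:
buy = builder_expected_value(True) > builder_expected_value(False)
```

If the surveyor's accuracy dropped, or the price rose, the builder's demand for the estimate of z would vanish; that price signal is what "chooses what to learn".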

Observe that the choice of such $z$ is itself "learned". The market not only learns, but also chooses what to learn. There is also no fundamental difference between meta-learning parameters and any other parameters that the market learns to learn.

In other words: things like hyperparameter optimization happen very "naturally" in markets; they simply appear as markets for factor goods.

Bounded rationality and pagerank

The current market probability for Trump winning the 2024 election is 60%.

Is this the rational probability? Well, if you took a perfect brain simulation of everyone who will vote in the 2024 election, you could probably come up with a much better estimate.

Instead, 60% is just the best we can do with the information — including algorithmic (e.g. logical) information — available. The fact that we choose not to take a perfect brain simulation etc. is another market decision — the market decision for “quantity of resources allocated to simulating voter brains”, which is low because we determined that “information on who will win this election” was not really worth that (impossibly high of a) cost.

But again, this market-calculated quantity is not perfect either — the price offered by the market for computational resources is, again, a price computed by a market of imperfect agents.

Similarly, the marketplaces / institutions that agents choose to trade on are also services that are chosen by agent demand, etc.

One reductive way to look at the problem of bounded rationality is that it’s just learning: that a boundedly rational agent is just a learning agent. See e.g. “asymptotic bounded optimality” in Stuart Russell (2016), Rationality and Learning: a brief update. It’s ok that the agent isn’t optimal, as long as it eventually learns to be. It’s ok that the agent’s allocation of resources to this pursuit of optimality isn’t optimal, as long as it eventually learns to be. Ad infinitum.

Markets behave the same way: one agent’s reward mechanism is another agent’s behaviour, which is trained by its own reward mechanism, which is yet another agent’s behaviour, ad infinitum … and there are optimization pressures at each level.

What are markets? The math

Some excerpts from a paper I have in progress. This “definition of a market” is valid in a setting that subsumes any MDP (proof left as an exercise), and subsumes the existing market-based RL toys listed in “Related work” (proof left as an exercise).

Observe that there is no fixed market graph or structure — this graph is generated/learned on the fly. The class of agents is not specified, and should be chosen so that a universal approximation theorem exists.

We may also consider markets where instead of transacting the entire “state” through the economy, the state is modularized into “goods” that are transacted separately. This way, agents can specialize not only on activation-conditions and actions, but also on what parts of the state to observe and act on.

(For neatness we’re pretending that there is some “general equilibrium computation algorithm” Equ — in reality you could imagine other mechanisms, e.g. asking for independent demand schedules for each good and computing their equilibria independently, so that the job of marginalizing their demand schedules on their estimated prices of other goods are left to the agents. You might generally have other mechanisms, requiring perhaps different signals from the agents, and it’s not clear to me exactly how to “abstract this out”. Perhaps the mechanism itself should be a good, IDK.)

Markets and backprop

What’s perhaps interesting with goods markets is that — although the market algorithm is not backpropagation, indeed the market does not even have a fixed structure to simply differentiate — at equilibrium, prices obey a certain chain rule relation.

i.e. if the agent can estimate what the market prices of its output goods will be (e.g. if prices are sufficiently stable to speak of a "prevailing price" $p_{\text{out}}$), then it can simply compute its offered price via the chain rule, $p_{\text{in}} = J^\top p_{\text{out}}$, where $J$ denotes the Jacobian of its production function.

I guess this is the multivariable form of the claim made by John Wentworth in "competitive markets as distributed backprop".

So prices are being backpropagated, and prices are derivatives (of benefit/cost). The offered price of steel being £1/kg tells you that there is £1 benefit to the world of increasing the production of steel by 1kg, the cost price of steel being £1/kg tells you there is £1 cost to the world of increasing the production of steel by 1kg; both of these are signals to change the production, albeit they cancel out.

BTW this means that at equilibrium the price vector $p$ is an eigenvector of $J^\top$. I think this is a formalization of the "pagerank" connection I touched on earlier.
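The chain-rule relation is easy to check numerically (an illustrative sketch with a made-up linear production function, not the paper's formalism):

```python
import numpy as np

# Toy check of "prices as backpropagated derivatives". A producer turns
# input goods x into output goods y = f(x); with stable output prices
# p_out, its break-even bid for inputs is p_in = J^T p_out, where J is
# the Jacobian of f. For a linear production function y = A @ x, J = A.

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])        # made-up linear production: y = A @ x

p_out = np.array([5.0, 1.0])      # prevailing prices of the output goods
p_in = A.T @ p_out                # chain-rule price of the input goods

# p_in[0] = 2*5 + 0*1 = 10: one extra unit of input good 0 yields 2 units
# of output good 0, worth 10 -- i.e. the derivative of revenue w.r.t. that
# input, backpropagated through the production function.
```

No global differentiation of the market happens anywhere; each producer only needs local prices and its own Jacobian.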

But key is that this backpropagation is not the only optimization the market does — it also simultaneously optimizes the whole market structure!

Related work

A brief bibliography of work in this area I’m aware of:

Classifier systems and citations thereof

Garrabrant induction and citations thereof

“Philosophical” commentary

Outlook and call for collaborators

So market-based agents are a promising idea for AGI, and we should develop AGI in the form of marketic agents. There are at least three schticks missing in this line of argumentation:

I think the main obstacle for getting marketic agents to work is that information markets are inefficient. They’re subject to buyer’s inspection/imperfect information and positive externalities (that’s the issue with prediction markets — sure, you can just subsidise one market with your “price for information” — but how can agents subsidize “subsidiary markets” that will help them in their forecast on your market, without creating positive externalities for someone else?). The Baum paper noticed this too — that you can’t have agents communicate useful information to other agents, like “in this scenario, don’t do X”.

This problem needs solving. Perhaps with an information bazaar (LLMs can act like a guy under an amnestic, so they can just inspect the information, make a purchase decision, then forget the information), or perhaps with “latent variable prediction markets”. Maybe combinatorial prediction markets are a relevant concept — is anyone familiar enough with the literature to tell me if they can do latent variable discovery?
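One way the "amnestic inspector" version of the information bazaar might be sketched (entirely hypothetical: the inspector function stands in for a memoryless LLM call, and the relevance heuristic inside it is a placeholder):

```python
# Hypothetical protocol sketch: a stateless inspector (standing in for an
# LLM run without memory) sees the information, returns only a buy/no-buy
# verdict and a valuation, and its transcript is then discarded -- so the
# buyer can decide to purchase without first learning the content for free.

def amnestic_inspect(info: str, buyer_need: str) -> tuple[bool, float]:
    # Stand-in for a memoryless LLM call; here a trivial relevance check.
    relevant = buyer_need.lower() in info.lower()
    return (relevant, 1.0 if relevant else 0.0)

def bazaar_transaction(seller_info: str, buyer_need: str, asking_price: float):
    verdict, value = amnestic_inspect(seller_info, buyer_need)
    # The inspection transcript is dropped here; only the verdict survives.
    if verdict and value >= asking_price:
        return seller_info          # buyer pays and receives the information
    return None                     # no sale; the buyer never saw the content

deal = bazaar_transaction("Earthquake risk in zone 4 is high.",
                          "earthquake risk", asking_price=0.5)
```

The design relies on the inspector genuinely forgetting what it saw, which is exactly the property a fresh LLM context gives you for free.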

A good starting to-do for this project will look something like this:

This will be a big project — I'm literally demanding that all AI work everywhere be reoriented towards marketic agents, and the TO-DO above is the bare minimum I need to do to produce valuable results and convince anyone that marketic agents are even a worthwhile research area.

If you think this is something you can meaningfully work on (if reading this post “struck a chord”, with related thoughts you’ve had before, if you have an economist’s intuitions), please reach out.

This project will be the greatest human achievement since the invention of agriculture. If we succeed, we will solve not only AI alignment, but also basically all of economics as well as philosophy.


  1. ^

    Anyone who has struggled a bit philosophically with the problem of bounded rationality understands how similar it is to the troubles with the Efficient Market Hypothesis. Perfect rationality ("just maximize utility bro") is uncomputable when such a maximization itself involves computational costs, so instead you're always following some heuristics. But you want some notion of being able to say these heuristics are "boundedly rational", i.e. "rationality conditional on the available logical (or more generally algorithmic) information". This concept degenerates to tautology quickly, but some inspiration comes from markets, where even the efficient market hypothesis is only true "conditional on the available algorithmic information" — i.e. computing arbitrage opportunities, adding and propagating information is not costless; the market adequately rewards such behaviour for the value created, but there's some restricted supply of this behaviour (which is the entire function of financial markets).

  2. ^

    This also means markets are subject to Gödelian incompleteness: the market cannot correctly price, at time t, the asset “this market will have market probability <0.4 at time t”.


