Decomposing Agency — capabilities without desires

Summary: This post examines what an AI agent is made of, arguing that an agent need not be a single indivisible entity but can be decomposed into interrelated components: goals, implementation capacity, situational awareness, and planning capacity. Real-world examples show how these components can be separated out and recombined into a complete agent; the post also discusses how AI affects the way each component can be provided, and what forms future agents might take.


Published on July 11, 2024 9:38 AM GMT

What is an agent? It’s a slippery concept with no commonly accepted formal definition, but informally the concept seems to be useful. One angle on it is Dennett’s Intentional Stance: we think of an entity as being an agent if we can more easily predict it by treating it as having some beliefs and desires which guide its actions. Examples include cats and countries, but the central case is humans.

The world is shaped significantly by the choices agents make. What might agents look like in a world with advanced — and even superintelligent — AI? A natural approach for reasoning about this is to draw analogies from our central example. Picture what a really smart human might be like, and then try to figure out how it would be different if it were an AI. But this approach risks baking in subtle assumptions — things that are true of humans, but need not remain true of future agents.  

One such assumption that is often implicitly made is that “AI agents” is a natural class, and that future AI agents will be unitary — that is, the agents will be practically indivisible entities, like single models. (Humans are unitary in this sense, and while countries are not unitary, their most important components — people — are themselves unitary agents.)

This assumption seems unwarranted. While people certainly could build unitary AI agents, and there may be some advantages to doing so, unitary agents are just an important special case within a much larger space of possible ways to put agents together.

We’ll begin an exploration of this space. We’ll consider four features we generally expect agents to have[1]: goals, implementation capacity, situational awareness, and planning capacity.

We don’t necessarily expect to be able to point to these things separately — especially in unitary agents they could exist in some intertwined mess. But we kind of think that in some form they have to be present, or the system couldn’t be an effective agent. And although these features are not necessarily separable, they are potentially separable — in the sense that there exist possible agents where they are kept cleanly apart.

We will explore possible decompositions of agents into pieces which contain different permutations of these features, connected by some kind of scaffolding. We will see several examples where people naturally construct agentic systems in ways where these features are provided by separate components. And we will argue that AI could enable even fuller decomposition.

We think it’s pretty likely that by default advanced AI will be used to create all kinds of systems across this space. (But people could make deliberate choices to avoid some parts of the space, so “by default” is doing some work here.)

A particularly salient division is that there is a coherent sense in which some systems could provide useful plans towards a user's goals, without in any meaningful sense having goals of their own (or conversely, have goals without any meaningful ability to create plans to pursue those goals). In thinking about ensuring the safety of advanced AI systems, it may be useful to consider the advantages and challenges of building such systems.

Ultimately, this post is an exploration of natural concepts. It’s not making strong claims about how easy or useful it would be to construct particular kinds of systems — it raises questions along these lines, but for now we’re just interested in getting better tools for thinking about the broad shape of design space. If people can think more clearly about the possibilities, our hope is that they’ll be able to make more informed choices about what to aim for.

Familiar examples of decomposed agency

Decomposed agency isn’t a new thing. Beyond the complex cases of countries and other large organizations, there are plenty of occasions where an agent uses some of the features-of-an-agent from one system, and others from another system. Let’s look at these with this lens.

To start, here’s a picture of a unitary agent:

They use their planning capacity to make plans, based on both their goals and their understanding of the situation they’re in, and then they enact those plans.
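To make the picture concrete, here is a minimal illustrative sketch of ours (not code from the post, and the garden example is invented): all four features live inside one object, and from the outside only the resulting actions are visible.

```python
class UnitaryAgent:
    # All four features live inside one object; only the resulting
    # actions are visible from outside.
    def __init__(self, goal: str):
        self.goal = goal                     # goals
        self.beliefs: dict[str, str] = {}    # situational awareness

    def observe(self, observation: dict[str, str]) -> None:
        # Update situational awareness from new information.
        self.beliefs.update(observation)

    def make_plan(self) -> list[str]:
        # Planning capacity: turn goal plus beliefs into steps.
        # (A stub; a real agent would search over possible plans.)
        return [f"work towards '{self.goal}' given {len(self.beliefs)} known facts"]

    def act(self) -> None:
        # Implementation capacity: enact the plan in the world.
        for step in self.make_plan():
            print("executing:", step)

agent = UnitaryAgent(goal="keep the garden watered")
agent.observe({"weather": "dry", "soil": "parched"})
agent.act()
```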

But here’s a way that these functions can be split across two different systems:

In this picture, the actor doesn’t come up with plans themselves — they outsource that part (while passing along a description of the decision situation to the planning advisor).

People today sometimes use coaches, therapists, or other professionals as planning advisors. Although these advisors are humans who in some sense have their own goals, professional excellence often means setting those aside and working for what the client wants. ChatGPT can also be used this way. It doesn’t have an independent assessment of the user’s situation, but it can suggest courses of action.
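As a rough sketch of this split (ours; the advisor below is a trivial stand-in for a coach or a chat model), planning can live in a stateless component that only ever sees whatever the actor chooses to share:

```python
from typing import Protocol

class PlanningAdvisor(Protocol):
    # A source of planning capacity with no goals or situational awareness
    # of its own beyond what it is handed in the call.
    def suggest(self, situation: str, goal: str) -> list[str]: ...

class TemplateAdvisor:
    # Stand-in for a coach, therapist, or chat model used as an advisor.
    def suggest(self, situation: str, goal: str) -> list[str]:
        return [
            f"given '{situation}', list the realistic options for '{goal}'",
            "try the cheapest reversible option first and review in a month",
        ]

class Actor:
    # The actor keeps the goal and the situational awareness,
    # and outsources only the planning step.
    def __init__(self, goal: str, advisor: PlanningAdvisor):
        self.goal = goal
        self.advisor = advisor

    def decide_and_act(self, situation: str) -> None:
        for step in self.advisor.suggest(situation, self.goal):
            print("executing:", step)

Actor("change careers", TemplateAdvisor()).decide_and_act("burned out in current job")
```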

Here’s another way the functions can be split across two systems:

People often use management consultants in something like this role, or ask friends or colleagues who already have situational awareness for advice. Going to a doctor who runs tests, makes a diagnosis, and uses it to prescribe a treatment you carry out at home is a case of using the doctor as a planning oracle. The right shape of AI system could help similarly — e.g. suppose that we had a medical diagnostic AI which was also trained on which recommendations-to-patients produced good outcomes.

The passive actor in this scenario need not be a full agent. One example is if the actor is the legal entity of a publicly traded firm, and the planning oracle is its board of directors. Even though the firm is non-sentient, it comes with a goal (maximize shareholder value), and the board has a fiduciary duty to that goal. The board makes decisions on that basis, and the firm takes formal actions as a result, like appointing the CEO. (The board may get some of its situational awareness from employees of the firm, or further outsource information gathering, e.g. to a headhunting firm.)
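A hedged sketch of this shape, loosely modeled on the board-of-directors example (the names and the stubbed situation are invented): the oracle gathers its own situational awareness and does the planning, while the passive actor contributes only a standing goal and the formal capacity to act.

```python
class PlanningOracle:
    # Brings its own situational awareness and planning capacity;
    # it is handed only a standing goal.
    def gather_situation(self) -> str:
        return "market is soft and the main competitor is expanding"  # stubbed awareness

    def decide(self, goal: str) -> str:
        situation = self.gather_situation()
        return f"given that {situation}: appoint a CEO with turnaround experience, in pursuit of '{goal}'"

class PassiveActor:
    # Contributes the goal and the formal capacity to act, nothing more.
    goal = "maximize shareholder value"

    def enact(self, decision: str) -> None:
        print("formal action taken:", decision)

actor = PassiveActor()
actor.enact(PlanningOracle().decide(actor.goal))
```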

Here’s another possible split:

Whereas a pure tool (like a spade, or an email client configured just to send mail) might provide just implementation capacity, an agentic tool does some of the thinking for itself. Alexa or Siri today are starting to go in this direction, and will probably go further (imagine asking one of them to book you a good restaurant in your city catering to particular dietary requirements). Lots of employment also looks somewhat like this: an employer asks someone to do some work (e.g. build a website to a design brief). The employee doesn’t understand all of the considerations behind why this was the right work to do, but they’re expected to work out for themselves how to deal with challenges that come up.

(In these examples the agentic tool is bringing some situational awareness, with regard to local information necessary for executing the task well, but the broader situational awareness which determined the choice of task came from the user.)
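Here is an illustrative sketch of ours (the task and constraints are invented): the user hands over a goal and a brief, and the agentic tool supplies its own local planning and implementation.

```python
from dataclasses import dataclass

@dataclass
class Brief:
    # The user's goal and the broad situational awareness behind it,
    # handed over up front.
    task: str
    constraints: list[str]

class AgenticTool:
    # Contributes local planning and implementation capacity, but never
    # sees why this particular task was chosen.
    def run(self, brief: Brief) -> None:
        for step in self._plan_locally(brief):
            print("doing:", step)

    def _plan_locally(self, brief: Brief) -> list[str]:
        steps = [f"break '{brief.task}' into subtasks"]
        steps += [f"respect constraint: {c}" for c in brief.constraints]
        steps.append("handle problems that come up without going back to the user")
        return steps

AgenticTool().run(Brief(task="build a website to the design brief",
                        constraints=["launch by Friday", "match the brand colours"]))
```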

And here’s a fourth split:

One archetypal case like this is a doctor, working to do their best by the wishes of a patient in a coma. Another would be the executors of wills. In these cases the scaffolding required is mostly around ensuring that the incentives for the autonomous agent align with the goals of the patient.

(A good amount of discussion of aligned superintelligent AI also seems to presume something like this setup.)

AI and the components of agency

Decomposable agents today arise in various situations, in response to various needs. We’re interested in how AI might impact this picture. A full answer to that question is beyond the scope of this post. But in this section we’ll provide some starting points, by discussing how AI systems today or in the future might provide (or use) the various components of agency.

Implementation capacity

We’re well used to examples where implementation capacity is relatively separable and can be obtained (or lost) by an agent. These include tools and money[2] as clear-cut examples, and influence and employees[3] as examples which are a little less easily separable.

Some types of implementation capacity are particularly easy to integrate into AI systems. AI systems today can send emails, run code, or order things online. In the future, AI systems could become better at managing a wider range of interfaces — e.g. managing human employees via calls. And the world might also change to make services easier for AI systems to engage with. Furthermore, future AI systems may provide many novel services in self-contained ways. This would broaden the space of highly-separable pieces of implementation capacity.
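One way to picture highly separable implementation capacity, as a sketch of ours (the tool names and behaviours are invented placeholders, not real APIs), is a registry of narrow actions that any planner, human or AI, can invoke by name:

```python
from typing import Callable

# Each entry is a narrow, self-contained piece of implementation capacity.
# The names and behaviours here are invented placeholders, not real APIs.
TOOLS: dict[str, Callable[[str], str]] = {
    "send_email":  lambda payload: f"email sent: {payload}",
    "run_code":    lambda payload: f"code executed: {payload}",
    "place_order": lambda payload: f"order placed: {payload}",
}

def execute(tool_name: str, payload: str) -> str:
    # A planner (human or AI) only needs to name the capacity and supply input.
    if tool_name not in TOOLS:
        raise ValueError(f"no such capacity: {tool_name}")
    return TOOLS[tool_name](payload)

print(execute("send_email", "weekly report to the team"))
```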

Situational awareness

LLMs today are good at knowing lots of facts about the world — a kind of broad situational awareness. And AI systems can be good at processing data (e.g. from sensors) to pick out the important parts. Moreover AI is getting better at certain kinds of learned interpretation (e.g. medical diagnosis). However, AI is still typically weak at knowing how to handle distribution shifts. And we’re not yet seeing AI systems doing useful theory-building or establishing novel ontologies, which is one important component of situational awareness.

In practice a lot of situational awareness consists of understanding which information is pertinent[4]. It’s unclear that this is a task at which current AI excels, although this may in part be a lack of training. LLMs can probably provide some analysis, though it may not be of high quality.

Goals

Goals are things-the-agent-acts-to-achieve. Agents don’t need to be crisp utility maximisers — the key part is that they intend for the world to be different than it is.

In scaffolded LLM agents today, a particular instance of the model is called, with a written goal to achieve. This pattern could continue — decomposed agents could work with written goals[5].

Alternatively, goals could be specified in some non-written form. For example, an AI classifier could be trained to approve of certain kinds of outcome, and then the goal could specify trying to get outcomes that would be approved of by this classifier. Goals could also be represented implicitly in an RL agent.

(How goals work in decomposed agents probably has a lot of interactions with what those agents end up doing — and how safe they are.)
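To illustrate the two representations just described, here is a sketch of ours in which a written goal and a classifier-style goal sit behind the same interface (the classifier is a trivial lambda standing in for a trained approval model):

```python
from typing import Callable, Protocol

class Goal(Protocol):
    # Anything that can say whether an outcome counts as achieving the goal.
    def approves(self, outcome: str) -> bool: ...

class WrittenGoal:
    # The goal is an explicit piece of text, passed around between components.
    def __init__(self, text: str):
        self.text = text

    def approves(self, outcome: str) -> bool:
        # Crude textual check, standing in for actually interpreting the goal.
        return self.text.lower() in outcome.lower()

class ClassifierGoal:
    # The goal is implicit in a trained approver; here the "classifier" is a
    # trivial lambda standing in for a learned approval model.
    def __init__(self, approver: Callable[[str], bool]):
        self.approver = approver

    def approves(self, outcome: str) -> bool:
        return self.approver(outcome)

written = WrittenGoal("all invoices paid")
learned = ClassifierGoal(lambda outcome: "paid" in outcome)
print(written.approves("All invoices paid and archived"))  # True
print(learned.approves("invoices paid, archive pending"))  # True
```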

Planning capacity

We could consider a source of planning capacity as a function which takes as inputs a description of a choice situation and a goal, and outputs a description of an action which will be (somewhat) effective in pursuit of that goal.
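In code, that description corresponds to roughly the following signature; the body below is a toy stand-in of ours, and the point is only that the function carries no goals or persistent state of its own:

```python
def plan(situation: str, goal: str) -> str:
    """A source of planning capacity: (choice situation, goal) -> suggested action.

    The toy body below is a stand-in; what matters is the signature,
    which has no goals or persistent state of its own.
    """
    if "at home" in situation and "office" in goal:
        return "take the 8:15 train, then walk two blocks north"
    return f"gather more information about '{situation}' before acting on '{goal}'"


print(plan("at home on a weekday morning", "be at the office by 9:00"))
```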

AI systems today can provide some planning capacity, although they are not yet strong at general-purpose planning. Google Maps can provide planning capacity for tasks that involve getting from one place to another. Chatbots can suggest plans for arbitrary goals, but not all of those plans will be very good.

Planning capacity and ulterior motives

When we use people to provide planning capacity, we are sometimes concerned about ulterior motives — ways in which the person’s other goals might distort the plans produced. Similarly we have a notion of “conflict of interest” — roughly, that one might have difficulty performing the role properly on account of other goals.

How concerned should we be about this in the case of decomposed agents? In the abstract, it seems entirely possible to have planning capacity free from ulterior motives. People are generally able to consider hypotheticals divorced from their goals, like "how would I break into this house" — indeed, sometimes we use planning capacity to prepare against adversaries, in which case the pursuit of our own goals requires that we be able to set aside our own biases and values to imagine how someone would behave given entirely different goals and implementation capacity.

But as a matter of practical development, it is conceivable that it will be difficult to build systems capable of providing strong general-purpose planning capacity without accidentally incorporating some goal-directed aspect, which may then have ulterior motives. Moreover, people may be worried that the system developers have inserted ulterior motives into the planning unit.

Even without particular ulterior motives, a source of planning capacity may impose its own biases on the plans it produces. Some of these could seem value-laden — e.g. some friends you might ask for advice would simply never consider suggesting breaking the law. However, such ~deontological or other constraints on the shape of plans are unlikely to blur into anything like active power-seeking behaviour — and thus seem much less concerning than the general form of ulterior motives.

Scaffolding

Scaffolding is the glue which holds the pieces of the decomposed agent together. It specifies what data structures are used to pass information between subsystems, and how they are connected. We’re using “scaffolding” in a more general sense of the term that is applied to the structures built around LLMs to turn them into agents (and perhaps let them interface with other systems, like software tools).

Scaffolding today includes the various UIs and APIs that make it easy for people or other services to access the kind of decomposed functionality described in the sections above. Underlying technologies for scaffolding may include standardized data formats, to make it easy to pass information around. LLMs allow AI systems to interact with free text, but unstructured text is often not the most efficient way for people to pass information around in hierarchies, and so we suspect it may also not be optimal for decomposed agents. In general it’s quite plausible that the ability to build effective decomposed agents in the future could be scaffolding-bottlenecked.
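As a hedged illustration of the structured-data point (a sketch of ours, not a proposal), scaffolding might pin down small typed messages that subsystems exchange instead of free text:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class PlanningRequest:
    # A structured message from the goal/situational-awareness components
    # to whatever provides planning capacity.
    goal: str
    situation_summary: str
    constraints: list[str]

@dataclass
class PlanningResponse:
    # The planner's structured reply, consumed by implementation capacity.
    steps: list[str]
    confidence: float

request = PlanningRequest(
    goal="restock the warehouse",
    situation_summary="two suppliers available, one of them delayed",
    constraints=["stay under the monthly budget"],
)

# The scaffolding only needs to agree on the schema and a serialization, e.g. JSON.
print(json.dumps(asdict(request)))
```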

Some questions

All of the above tells us something about the possible shapes systems could have. But it doesn’t tell us so much about what they will actually look like.

We are left with many questions.

Possibility space

We’ve tried to show that there is a rich space of (theoretically) possible systems. We could go much deeper on understanding this space.

Efficiency

What is efficient could have a big impact on what gets deployed. Can we speak to this?

Safety

People have various concerns about AI agents. These obviously intersect with questions of how agency is instantiated by AI systems.

So what?

Of all the ways people anthropomorphize AI, perhaps the most pervasive is the assumption that AI agents, like humans, will be unitary.

The future, it seems to us, could be much more foreign than that. And its shape is, as far as we can tell, not inevitable. Of course much of where we go will depend on local incentive gradients. But the path could also be changed by deliberate choice. Individuals could build towards visions of the future they believe in. Collectively, we might agree to avoid certain parts of design space — especially if good alternatives are readily available.

Even if we keep the basic technical pathway fixed, we might still navigate it well or poorly. And we're more likely to do it well if we've thought it through carefully, and prepared for the actual scenario that transpires. Some fraction of work should, we believe, continue on scenarios where the predominant systems are unitary. But it would be good to be explicit about that assumption. And probably there should be more work on preparing for scenarios where the predominant systems are not unitary.

But first of all, we think more mapping is warranted. People sometimes say that AGI will be like a second species; sometimes like electricity. The truth, we suspect, lies somewhere in between. Unless we have concepts which let us think clearly about that region between the two, we may have a difficult time preparing.

Acknowledgements

A major source of inspiration for this thinking was Eric Drexler’s work. Eric writes at AI Prospects.

Big thanks to Anna Salamon, Eric Drexler, and Max Dalton for conversations and comments which helped us to improve the piece.

  1. ^

     Of course this isn’t the only way that agency might be divided up, and even with this rough division we probably haven’t got the concepts exactly right. But it’s a way to try to understand a set of possible decompositions, and so begin to appreciate the scope of the possible space of agent-components.

  2. ^

     Money is a particularly flexible form of implementation capacity. However, deploying money generally means making trades with other systems in exchange for something (perhaps other forms of implementation capacity) from them. Therefore, in cases where money is a major form of implementation capacity for an agent, there will be a question of where to draw the boundaries of the system we consider the agent. Is it best if the boundary swallows up the systems that are employed with money, and so regards the larger gestalt as a (significantly decomposed) agent?

    (This isn’t the only place where there can be puzzles about where best to draw the boundaries of agents.)

  3. ^

     We might object: “wait, aren’t those agents themselves?” But pragmatically, it often seems to make sense to treat something as sophisticated implementation capacity of the larger agent even when it implicitly includes some local planning capacity and situational awareness, and may itself be provided by an agent.

  4. ^

     Some situational awareness is about where the (parts of the) agent itself can be found. This information should be easily provided in separable form. Because of safety considerations, people are sometimes interested in whether systems will spontaneously develop this type of situational awareness, even if it’s not explicitly given to them (or even if it’s explicitly withheld).

  5. ^

     One might worry that written goals would necessarily have the undesirable feature that, by being written down, they would be forever ossified. But it seems like that should be avoidable, just by having content in the goals which provides for their own replacement. Just as, in giving instructions to a human subordinate, one can tell them when to come back and ask more questions, so too a written goal specification could include instructions on circumstances in which to consult something beyond the document (perhaps the agentic system which produced the document).
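     For instance, here is an invented example of a written goal that provides for its own revision:

```python
# An invented example of a written goal that provides for its own revision.
goal_document = {
    "objective": "keep the household garden healthy",
    "consult_beyond_this_document_when": [
        "the estimated cost of an action exceeds a set threshold",
        "two instructions in this document appear to conflict",
    ],
    "who_to_consult": "the system that issued this goal",
}
```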



