Alignment First, Intelligence Later: A Teleological Turn in AI

This article introduces Softmax, a new lab dedicated to building aligned AI on teleological principles. It contrasts two ways of understanding how systems evolve: etiology, which builds up from the smallest parts, and teleology, which decomposes downward from the largest purposes. Through simulation experiments in which small virtual agents spontaneously form stable roles within simulated worlds, Softmax pursues organic alignment, arguing that alignment should be the first priority of AI development rather than a late addition. The approach mirrors the organic alignment between individuals in human society and aims to build intelligence that serves the whole.

🌱 **Two ways systems evolve:** The article contrasts two ways of understanding how systems evolve: etiology, which builds up from the smallest parts, and teleology, which decomposes downward from the largest purposes. Etiology asks how the past shapes the future; teleology asks how the present serves the future.

💡 **The conventional approach to aligned AI:** Almost all frontier AI labs build AI etiologically: start from base components, train on vast data, then try to steer the resulting intelligence after the fact. This risks producing AI whose goals we can neither understand nor control.

💫 **Softmax's approach:** Softmax builds aligned AI on teleological principles: start from the desired alignment, build systems that organically sustain those behaviors, and grow intelligence in service of alignment. In its simulation experiments, small virtual agents spontaneously form stable roles within simulated worlds, achieving organic alignment.

🌟 **Why organic alignment matters:** The article argues that sustainable AI alignment must begin with a shared vision of the whole and be built from there, mirroring the organic alignment between individuals in human society: families, organizations, nations. Softmax's goal is to understand and harness this organic alignment process to align both the human and digital worlds.

Published on March 30, 2025 10:26 PM GMT

Now that Softmax—my favorite new AI company—is public, I can finally share this. They’ve funded my research and I’m very excited about what they’re doing!


Almost all frontier AI labs are building powerful systems from the ground up, hoping alignment can come later. I think this approach is backwards.

First, some philosophy. There are two fundamentally different ways of understanding how systems evolve:

1. Etiology: Building from smallest pieces upward.

2. Teleology: Breaking down from largest purposes downward.

In modern engineering culture, etiological thinking dominates and teleology is often dismissed as “woo.” But teleology is crucial for understanding any system that pursues goals. A 2022 DeepMind paper articulates this teleological view: “agents are systems that would adapt their policy if their actions influenced the world in a different way.”[1] That makes teleology an essential lens for understanding, and for building, goal-oriented systems like aligned AI.
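To make that criterion concrete, here is a minimal sketch, my own illustration rather than the paper's causal-graph formalism, with invented names throughout: an adaptive system changes its policy under an intervention on the action-outcome mechanism, while a hard-coded system does not.

```python
# A toy version (my illustration, not the paper's formalism) of the
# "Discovering Agents" criterion: a system is agent-like if intervening
# on how its actions influence the world would change its policy.

def adaptive_system(reward_fn, actions=(0, 1)):
    # Re-derives its policy from whatever mechanism it currently faces.
    return max(actions, key=reward_fn)

def fixed_system(reward_fn, actions=(0, 1)):
    # Behavior set once and for all; ignores the actual mechanism.
    return actions[0]

baseline = lambda a: float(a == 1)    # action 1 is rewarded
intervened = lambda a: float(a == 0)  # intervention: reward flipped

for system in (adaptive_system, fixed_system):
    adapts = system(baseline) != system(intervened)
    print(f"{system.__name__}: agent-like -> {adapts}")
# adaptive_system adapts (agent-like); fixed_system does not.
```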

Why this matters for aligned AI:

Currently, almost all frontier AI labs take an etiological approach to alignment:

    1. Build base components (transformers, weights, architecture).
    2. Train on vast data to develop complex capabilities.
    3. Attempt to steer the resulting intelligence after the fact.

This approach is etiological—stacking intelligence from simple components without anchoring to a desired end-state. It’s like understanding depression purely through brain chemistry instead of seeing it as a locally optimal, adaptive strategy. Teleology doesn’t skip the build phase—it just assumes coherent alignment can organically emerge under the right conditions, shaping each layer from the top down.

This risks creating powerful agents with goals we neither understand nor control.

“Intelligence first, alignment later.” 

This assumption can also become self-fulfilling: if we assume alignment is fragile and must be tacked on later, we set ourselves up to realize exactly that outcome.[2] Conversely, assuming alignment can be robust from the outset biases our designs toward solutions that organically maintain alignment as intelligence scales.

Teleological alignment—what would it look like?

    1. Start with the alignment we want.
    2. Build systems that organically sustain these behaviors as they scale.
    3. Grow intelligence in service of alignment.
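As a toy contrast with the etiological pipeline above, and emphatically not Softmax's actual training procedure, here is a sketch in which each proposed update buys capability at some possible cost to alignment: the “intelligence first” regime accepts every update, while the “alignment first” regime gates updates on an alignment floor. All names and numbers are invented for illustration.

```python
import random

# A toy contrast (my illustration, not Softmax's method) between the two
# orderings. Each proposed update adds capability at a possible cost to
# alignment; the gated regime only accepts updates that preserve an
# alignment floor, so intelligence grows in service of alignment.

random.seed(0)

def propose(state):
    # A candidate training step: more capable, possibly less aligned.
    return {"capability": state["capability"] + random.uniform(0.0, 1.0),
            "alignment":  state["alignment"]  - random.uniform(0.0, 0.1)}

def train(gated, steps=200, floor=0.9):
    state = {"capability": 0.0, "alignment": 1.0}
    for _ in range(steps):
        candidate = propose(state)
        if not gated or candidate["alignment"] >= floor:
            state = candidate  # if gated, accept only within the invariant
    return state

print("intelligence first:", train(gated=False))  # capable, far below the floor
print("alignment first:   ", train(gated=True))   # capability grows, floor holds
```

The point of the sketch is the ordering, not the numbers: in the gated run alignment is never traded away, and whatever capability accrues is capability the invariant allowed.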

This mirrors how biological systems function. Michael Levin's research shows living systems sustain goal-directed behavior at multiple scales[3]—cells, tissues, and organs inherently “know” their roles.

Enter Softmax

This is why I’m so excited about Softmax, a new lab implementing teleological principles for building aligned AI.

Note: While I’m friends with Softmax and they’ve funded my research, I do not represent Softmax.

Currently, Softmax runs reinforcement learning experiments in which small-scale virtual agents tend to organically discover stable roles within simulated worlds, mirroring biological processes where alignment naturally emerges from local interactions. In these simulations, agents organically align toward a collective “greater whole”, each agent's role supporting the group's coherence.
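For intuition about how this can happen, here is a minimal multi-agent sketch; it is my toy, not Softmax's actual experimental setup. Independent learners share a single group reward that pays for covering distinct roles, so complementary specialization tends to emerge from local updates alone; the agent count, role names, and learning parameters are all invented.

```python
import random

# A toy sketch (mine, not Softmax's setup): independent learners with one
# shared group reward that pays for role coverage. No agent is assigned a
# role; stable, complementary roles tend to emerge from local updates.

random.seed(1)
ROLES = ("forage", "build", "defend")
N_AGENTS, EPISODES, LR = 3, 2000, 0.1
q = [{r: 0.0 for r in ROLES} for _ in range(N_AGENTS)]  # per-agent values

def group_reward(choices):
    # The "greater whole" rewards coverage: one point per distinct role.
    return len(set(choices))

for ep in range(EPISODES):
    eps = max(0.05, 1.0 - ep / 1000)  # decaying exploration
    choices = [random.choice(ROLES) if random.random() < eps
               else max(ROLES, key=qi.get)
               for qi in q]
    r = group_reward(choices)
    for qi, c in zip(q, choices):
        qi[c] += LR * (r - qi[c])  # each agent learns from the shared reward

print([max(qi, key=qi.get) for qi in q])  # typically three distinct roles
```

The only global signal here is the shared reward; nothing ever tells an agent which role to take, which is the “organic” flavor the quote below describes.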

Here’s what they say about their philosophy:

All alignment between individuals is a matter of shared fundamental goals. Organic alignment occurs when these individuals find themselves in groups with mutual interdependence, and take on an overarching shared goal of the healthy development and flourishing of the group as a whole. Our mission is to understand this process as an empirical science, and to use that understanding to enable organic alignment among all people, both human and digital.

Softmax is attempting to operationalize teleological alignment. I know that teleology can sound like wishful thinking—but it’s only woo if it can’t be tested. Softmax is betting that it can:

Softmax isn’t just trying to train aligned behavior. They're trying to grow it—starting from alignment itself.

“Alignment first, intelligence later.”

This also puts words to something I’ve long intuited: humans align effortlessly and organically. Softmax describes this well:

We humans also align with each other via organic alignment. We form families, tribes, organizations, nations, guilds, teams, societies. We intuit this alignment process so naturally and readily that it’s hard to appreciate just how complex the process really is.

Teleological thinking mirrors my understanding of human psychology—purpose-oriented framings reveal insights hidden by reductionist methods.

Softmax’s notion of “Organic Alignment” captures this: sustainable AI alignment must begin with a vision of a coherent whole and build toward it intentionally.

Ultimately, embracing teleology is essential if we want AI that aligns not just with narrow human preferences, but with the very project of building ever-larger superorganisms. Alignment should be the foundation, not an afterthought.

Like a living system, sustainable alignment begins with a shared purpose—and grows into a coherent whole.

Alignment first. Intelligence in service of the whole.

  1. ^ “Discovering Agents” (DeepMind, 2022): arxiv.org

  2. ^ Example: “Self-fulfilling misalignment data might be poisoning our AI models” (TurnTrout, 2025): lesswrong.com

  3. ^ “Technological Approach to Mind Everywhere” (Levin, 2022): frontiersin.org


