AlphaGo Moment for Model Architecture Discovery (arXiv)

LessWrong, July 27, 05:34


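A new paper, ASI-ARCH, has drawn wide attention in the AI research community. The system is described as the first "Artificial Superintelligence for AI research" (ASI4AI), specialized in neural architecture discovery. ASI-ARCH aims to break the linear limit that human cognitive capacity places on the pace of AI research: by autonomously hypothesizing, implementing, training, and validating new architectures, it is meant to shift the paradigm from automated optimization to automated innovation. The system ran 1,773 autonomous experiments over 20,000 GPU hours and discovered 106 innovative, state-of-the-art (SOTA) linear attention architectures whose design principles surpass human-designed baselines. The paper also proposes an empirical scaling law for scientific discovery, indicating that architectural breakthroughs can be scaled with compute, so that research progress becomes a computation-scalable rather than human-limited process.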

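🚀 ASI-ARCH's breakthrough innovation is its pioneering "Artificial Superintelligence for AI research" (ASI4AI) capability, focused on neural architecture discovery and aimed at the bottleneck that human cognitive capacity imposes on AI research. It fundamentally changes the research mode from "automated optimization" to "automated innovation," enabling AI to carry out frontier scientific research autonomously.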

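💡 Through an end-to-end autonomous research pipeline, ASI-ARCH can independently propose novel architectural concepts, turn them into executable code, and train and validate them through large-scale experiments. In doing so it draws on past human and AI experience, demonstrating strong AI autonomy in scientific discovery.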

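📈 The system completed 1,773 autonomous experiments over 20,000 GPU hours and discovered 106 innovative, state-of-the-art (SOTA) linear attention architectures. These AI-discovered architectures exhibit "emergent design principles" that surpass human-designed baselines and reveal previously unknown paths for architectural innovation.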

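📊 One of the paper's core contributions is an "empirical scaling law for scientific discovery," showing that architectural breakthroughs can be scaled with compute, which turns research progress from a process limited by human cognition into one that scales with computation. This suggests that AI-driven research will enter a self-accelerating era.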

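🌐 To democratize AI-driven research, the team has open-sourced the complete ASI-ARCH framework, the discovered architectures, and the "cognitive traces" from the research process, encouraging broader community participation and development.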

Published on July 26, 2025 9:31 PM GMT

A new paper is picking up steam in the Twitter/X AI discourse, mostly thanks to its absurdly boastful title and abstract. I'm trying to figure out how important the paper is and whether its methodology and results are sound, but it's hard to find good analysis through all the noise.

While AI systems demonstrate exponentially improving capabilities, the pace of AI research itself remains linearly bounded by human cognitive capacity, creating an increasingly severe development bottleneck. We present ASI-ARCH, the first demonstration of Artificial Superintelligence for AI research (ASI4AI) in the critical domain of neural architecture discovery—a fully autonomous system that shatters this fundamental constraint by enabling AI to conduct its own architectural innovation. Moving beyond traditional Neural Architecture Search (NAS), which is fundamentally limited to exploring human-defined spaces, we introduce a paradigm shift from automated optimization to automated innovation. ASI-ARCH can conduct end-to-end scientific research in the challenging domain of architecture discovery, autonomously hypothesizing novel architectural concepts, implementing them as executable code, training and empirically validating their performance through rigorous experimentation and past human and AI experience. ASI-ARCH conducted 1,773 autonomous experiments over 20,000 GPU hours, culminating in the discovery of 106 innovative, state-of-the-art (SOTA) linear attention architectures. Like AlphaGo’s Move 37 that revealed unexpected strategic insights invisible to human players, our AI-discovered architectures demonstrate emergent design principles that systematically surpass human-designed baselines and illuminate previously unknown pathways for architectural innovation (Fig. 2). Crucially, we establish the first empirical scaling law for scientific discovery itself—demonstrating that architectural breakthroughs can be scaled computationally, transforming research progress from a human-limited to a computation-scalable process. We provide comprehensive analysis of the emergent design patterns and autonomous research capabilities that enabled these breakthroughs, establishing a blueprint for self-accelerating AI systems. 
To democratize AI-driven research, we open-source the complete framework, discovered architectures, and cognitive traces.

