cs.AI updates on arXiv.org 20 hours ago
Compute Requirements for Algorithmic Innovation in Frontier AI Models

This paper empirically investigates the compute requirements of algorithmic innovation in the pretraining of large language models, and its analysis finds that compute caps alone are unlikely to dramatically slow AI algorithmic progress.

arXiv:2507.10618v1 Announce Type: cross Abstract: Algorithmic innovation in the pretraining of large language models has driven a massive reduction in the total compute required to reach a given level of capability. In this paper we empirically investigate the compute requirements for developing algorithmic innovations. We catalog 36 pre-training algorithmic innovations used in Llama 3 and DeepSeek-V3. For each innovation we estimate both the total FLOP used in development and the FLOP/s of the hardware utilized. Innovations using significant resources double in their requirements each year. We then use this dataset to investigate the effect of compute caps on innovation. Our analysis suggests that compute caps alone are unlikely to dramatically slow AI algorithmic progress. Even stringent compute caps -- such as capping total operations to the compute used to train GPT-2 or capping hardware capacity to 8 H100 GPUs -- could still have allowed for half of the cataloged innovations.
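To make the two caps concrete, here is a minimal sketch (not code from the paper) of how a cataloged innovation's development estimates could be checked against them. The GPT-2 training-compute figure (~1.5e21 FLOP) and per-H100 throughput (~1e15 FLOP/s) used below are rough external estimates assumed for illustration, and the example innovation and its numbers are hypothetical placeholders rather than entries from the paper's catalog.

# Illustrative sketch: check an innovation's development estimates against the
# two hypothetical compute caps described in the abstract. The constants are
# assumed order-of-magnitude estimates, not values taken from the paper.
from dataclasses import dataclass

GPT2_TRAIN_FLOP = 1.5e21            # assumed estimate of GPT-2 total training compute (FLOP)
H100_PEAK_FLOPS = 1e15              # assumed ~1e15 FLOP/s per H100 (order of magnitude)
CAP_TOTAL_FLOP = GPT2_TRAIN_FLOP    # cap on total operations used in development
CAP_HARDWARE_FLOPS = 8 * H100_PEAK_FLOPS  # cap on hardware capacity (8 H100 GPUs)

@dataclass
class Innovation:
    name: str
    dev_flop: float        # estimated total FLOP used to develop the innovation
    hardware_flops: float  # estimated FLOP/s of the hardware used

def allowed_under_caps(innov: Innovation) -> dict:
    """Report which of the two caps this innovation's development would satisfy."""
    return {
        "total_flop_cap": innov.dev_flop <= CAP_TOTAL_FLOP,
        "hardware_cap": innov.hardware_flops <= CAP_HARDWARE_FLOPS,
    }

# Hypothetical example entry (placeholder numbers, not from the paper's catalog).
example = Innovation(name="example-pretraining-innovation", dev_flop=5e20, hardware_flops=4e15)
print(allowed_under_caps(example))  # -> {'total_flop_cap': True, 'hardware_cap': True}

Applied across all 36 cataloged innovations, a check of this form is what underlies the paper's claim that even stringent caps could still have allowed roughly half of them.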


Related tags

Algorithmic innovation, compute requirements, large language models, pretraining, compute caps