TMTPost: Leading New Knowledge in Future Business and Life · 15 hours ago
China Eyes Breakthroughs in AI Chip Power as Global Arms Race Accelerates


🔬China's AI chip market is projected to exceed $180 billion by 2030, while the broader AI-related economy could top $1.4 trillion; with domestic production still trailing and foreign dependency high, Beijing is pushing indigenous innovation not only in silicon but also in architecture and design methodology.

🚀STCO (System-Technology Co-Optimization) focuses on optimizing performance, power, area, and cost (PPAC) across the entire design-manufacture chain, integrating chip architecture, manufacturing processes, and packaging into a collaborative design cycle that could help Chinese chipmakers bypass global foundry restrictions.

🧠China's AI compute demand is expected to grow exponentially: AI acceleration chip revenue is forecast to reach $183.4 billion by 2029 (a 53.7% CAGR), and total compute capacity is projected to grow nearly sixfold, yet China still relies heavily on GPUs, which account for nearly 70% of all AI chips in use.

💻Alternatives such as the Reconfigurable Processing Unit (RPU) are emerging: Groq claims 10× the inference speed of Nvidia's H100 at one-tenth the cost, Tesla's Dojo supercomputer uses a similar distributed architecture, and Google's TPU v7 "Ironwood" claims a 3,600× performance gain.

🏭Tsinghua spinout Qingwei Intelligence has launched the TX81 RPU module, supporting trillion-parameter models at 4 PFLOPS per server node; it is deployed at intelligent computing centers across several provinces to serve domestic models like DeepSeek, with the goal of self-reliant infrastructure free of foreign supply chains.

AsianFin -- As a new wave of artificial intelligence sweeps across industries, the global race for computing power is entering a decisive phase—one that could redefine technology leadership for decades.

With Nvidia’s market capitalization surpassing $4 trillion in early July—eclipsing Apple and Microsoft—and CEO Jensen Huang overtaking Warren Buffett in personal wealth, the AI boom is reshaping capital markets and geopolitical tech priorities.

While Nvidia’s success epitomizes the dominance of GPU-based AI compute architecture, China is scrambling to close a widening gap. Amid intensifying restrictions on chip supply and advanced manufacturing, China’s AI sector is battling both supply-side constraints and a demand surge driven by large model developers like DeepSeek.

According to industry forecasts, China’s AI chip market is projected to exceed $180 billion by 2030, while the broader AI-related economy could top $1.4 trillion. But with domestic production still trailing and foreign dependency high, Beijing is aggressively pursuing indigenous innovation—not only in silicon but also in architecture and design methodologies.

At the China Integrated Circuit Design Innovation Conference (ICDIA) in early July, Tsinghua University professor and Qingwei Intelligence co-founder Yin Shouyi presented a stark assessment: chip innovation must move beyond Moore’s Law. As transistor miniaturization nears physical limits, the next frontier lies in System-Technology Co-Optimization (STCO)—a methodology focused on optimizing performance, power, area, and cost (PPAC) across the entire design-manufacture stack.

STCO integrates chip architecture, manufacturing processes, and packaging into a deeply collaborative design cycle. This strategy could help Chinese chipmakers bypass bottlenecks imposed by global foundry restrictions and build AI-specific chips tailored for massive-scale, spatiotemporal workloads.
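The co-design idea can be caricatured in a few lines. The following is a hypothetical toy, with invented option tables and weights, not an actual STCO flow: instead of fixing architecture, process, and packaging one after another, a joint search scores every combination on a PPAC-style objective.

```python
# Toy joint PPAC search over architecture/process/packaging choices.
# All option values and weights are invented for illustration; a real
# STCO flow would use detailed models of each layer (area is omitted
# here for brevity).
from itertools import product

archs     = {"systolic": (1.0, 1.2), "dataflow": (1.3, 1.0)}  # (perf, power)
processes = {"7nm": (1.2, 1.5), "14nm": (1.0, 1.0)}           # (perf, cost)
packaging = {"2.5D": (1.2, 1.3), "chiplet": (1.1, 1.0)}       # (perf, cost)

def ppac_score(arch, proc, pkg):
    perf = archs[arch][0] * processes[proc][0] * packaging[pkg][0]
    power = archs[arch][1]
    cost = processes[proc][1] * packaging[pkg][1]
    return perf / (power * cost)  # higher is better

# Joint search: evaluate all combinations instead of optimizing each
# layer in isolation.
best = max(product(archs, processes, packaging),
           key=lambda combo: ppac_score(*combo))
print(best, round(ppac_score(*best), 2))
```

The point of the sketch is only that the optimum of the joint search can differ from what sequential, layer-by-layer optimization would pick.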

“China needs to move fast and think holistically,” said Yin, outlining four key focus areas: architecture exploration, component design, rapid simulation, and process co-optimization. “It’s not just about speed—it’s about evolving the entire chip design ecosystem to meet the demands of next-gen AI.”

AI compute demand in China is expected to grow exponentially. According to Frost & Sullivan, AI acceleration chip revenue will rise from $19.6 billion in 2024 to $183.4 billion by 2029, representing a compound annual growth rate (CAGR) of 53.7%. Over the same period, China’s total compute capacity is projected to increase nearly sixfold, from 617 EFLOPS to over 3,440 EFLOPS.
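As a rough sanity check on these projections (a back-of-envelope calculation, not Frost & Sullivan's methodology), the implied growth multiples can be computed directly from the cited figures:

```python
# Back-of-envelope check of the projections cited above. Figures come
# from the article; treating 2024-2029 as 5 compounding years is an
# assumption about the period convention.
rev_2024_bn = 19.6    # AI acceleration chip revenue, $bn (2024)
rev_2029_bn = 183.4   # projected revenue, $bn (2029)
years = 5             # assumed compounding periods

cagr = (rev_2029_bn / rev_2024_bn) ** (1 / years) - 1
# Over 5 periods this lands near 56%; the 53.7% in the article
# presumably reflects Frost & Sullivan's own period convention.

compute_2024_eflops = 617
compute_2029_eflops = 3440
growth = compute_2029_eflops / compute_2024_eflops  # "nearly sixfold"

print(f"implied CAGR: {cagr:.1%}, compute growth: {growth:.1f}x")
```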

However, the country remains heavily reliant on GPUs, primarily from Nvidia, which currently account for nearly 70% of all AI chips in use. Alternatives, though, are emerging.

One such alternative is the Reconfigurable Processing Unit (RPU), a chip architecture based on distributed dataflow computing. Unlike traditional, instruction-driven GPUs, RPUs dynamically allocate compute resources, enabling higher throughput and energy efficiency tailored for AI inference and training.
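The dataflow idea can be illustrated in miniature (a toy sketch, not Qingwei's or Groq's actual design): in a dataflow machine, an operation fires as soon as its operands arrive, rather than waiting its turn in a sequential instruction stream.

```python
# Toy dataflow executor: each node fires when all of its inputs are
# ready, with no central instruction pointer. Illustrative only --
# real RPUs map such graphs onto spatial hardware.
from collections import deque

def run_dataflow(nodes, inputs):
    """nodes: {name: (fn, [input names])}; inputs: {name: value}."""
    values = dict(inputs)
    pending = deque(nodes.items())
    while pending:
        name, (fn, deps) = pending.popleft()
        if all(d in values for d in deps):      # operands ready -> fire
            values[name] = fn(*(values[d] for d in deps))
        else:
            pending.append((name, (fn, deps)))  # wait for operands
    return values

# A tiny graph: out = (a + b) * (a - b)
graph = {
    "sum":  (lambda x, y: x + y, ["a", "b"]),
    "diff": (lambda x, y: x - y, ["a", "b"]),
    "out":  (lambda x, y: x * y, ["sum", "diff"]),
}
result = run_dataflow(graph, {"a": 5, "b": 3})
print(result["out"])  # (5+3)*(5-3) = 16
```

Note that "sum" and "diff" have no ordering constraint between them; in spatial hardware they would execute concurrently, which is the source of the claimed throughput and efficiency gains.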

Companies like SambaNova and Groq are leading the global charge. Groq claims its chips offer 10× the inference speed of Nvidia’s H100 at one-tenth the cost, while Tesla has adopted a similar distributed architecture in its Dojo supercomputer. Google’s newly launched TPU v7 “Ironwood” boasts a staggering claimed 3,600× performance gain over the company’s first publicly available TPU, further intensifying the competition.

At the center of China’s RPU development is Qingwei Intelligence, a spinout from Tsinghua’s Reconfigurable Computing Lab. The firm has launched the TX8 series and the latest TX81 RPU module, capable of delivering 512 TFLOPS (FP16). Its REX1032 server, designed for trillion-parameter models, reaches 4 PFLOPS per node and supports direct multi-card interconnects, eliminating costly switch hardware.
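Assuming the node aggregates TX81 modules with near-linear scaling (an assumption for illustration; the article does not state the card count), the published per-node figure implies roughly eight modules:

```python
# Implied TX81 module count per REX1032 node, assuming near-linear
# scaling of the published per-module figure (an assumption; the
# article gives only the per-module and per-node peaks).
tx81_tflops_fp16 = 512   # per TX81 module (from the article)
node_pflops = 4          # REX1032 per-node peak (from the article)

modules = node_pflops * 1000 / tx81_tflops_fp16  # ~7.8
print(round(modules))
```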

Qingwei’s chips are now deployed at intelligent computing centers across several Chinese provinces, serving leading domestic models like DeepSeek R1 and V3. The goal: build a self-reliant infrastructure that rivals U.S. giants without leaning on foreign supply chains.

Whether it’s GPUs, RPUs, or TPUs, one reality is clear: AI’s future depends on scalable, efficient compute infrastructure. As Nvidia CEO Jensen Huang has stated, AI is becoming as foundational as electricity or the internet.

With next-generation workloads—like Agentic AI and Physical AI—demanding unprecedented levels of compute, data centers are quickly evolving into AI factories, the core computational units of the digital future.

For China, catching up in the AI race will require more than scale—it will demand strategic shifts in chip architecture, cross-disciplinary innovation, and full-stack ecosystem development.
