钛媒体 (TMTPost): Leading New Insights into Future Business and Life, March 24, 15:36
Jack Ma-Backed Ant Group Achieves AI Breakthrough Using Chinese-Made Chips

 

Ant Group has developed AI training techniques using Chinese-made chips, cutting costs by 20%. Using the Mixture of Experts (MoE) machine learning approach and chips from Alibaba and Huawei, Ant achieved performance comparable to Nvidia's H800 GPUs. The move marks intensifying competition between Chinese firms and U.S. companies in AI model development and underscores China's effort to reduce reliance on Nvidia chips. Ant plans to apply its AI technology to sectors such as healthcare and finance, and has open-sourced its Ling models to drive further AI development.

💡 Ant Group developed AI training techniques using domestic chips, notably from Alibaba and Huawei, cutting costs by up to 20%.

⚙️ Ant adopted the Mixture of Experts (MoE) machine learning approach, also used by AI leaders such as Google and DeepSeek, which breaks tasks into smaller, specialized parts to improve processing efficiency.

💰 Ant's research shows that training 1 trillion tokens costs about 6.35 million yuan ($880,000) on conventional high-performance hardware, while its optimized approach cuts this to 5.1 million yuan by enabling training on lower-performance hardware.

🚀 Ant plans to apply its AI breakthroughs to industrial fields, particularly healthcare and finance, where its Ling-Plus and Ling-Lite models are expected to play a key role.

🌐 Ant has open-sourced its Ling models so researchers and developers worldwide can explore its innovations; Ling-Lite has 16.8 billion parameters, while Ling-Plus has 290 billion.

(Image credit: Photo by editor Lin Zhijia)

AsianFin -- Ant Group, an affiliate company of Chinese conglomerate Alibaba Group, has developed AI training techniques using Chinese-made semiconductors that could reduce costs by 20%.

The fintech giant leveraged domestic chips from affiliates such as Alibaba Group Holding Ltd. and Huawei Technologies Co. to train models through the Mixture of Experts (MoE) machine learning approach, the sources said.

These models reportedly achieved performance levels comparable to those trained on Nvidia's H800 GPUs. While Ant continues to use Nvidia chips for AI development, it has increasingly turned to alternatives, including Advanced Micro Devices Inc. and other Chinese semiconductor providers, one source added.

Ant's latest development signals its entry into the intensifying competition between Chinese and U.S. firms to develop cutting-edge AI models. This race has gained momentum after DeepSeek demonstrated that highly capable AI models can be trained at a fraction of the cost spent by OpenAI and Alphabet Inc.'s Google.

The shift also underscores how Chinese companies are working to reduce reliance on Nvidia's advanced semiconductors, which are subject to U.S. export restrictions. While the H800 is not the most advanced Nvidia GPU, it remains one of the most powerful AI chips currently banned from sale to China.

This month, Ant published a research paper claiming that its AI models outperformed those of Meta Platforms Inc. on certain benchmarks. While these claims have not been independently verified, such advancements could be significant if confirmed. If Ant's technology performs as advertised, it could enhance China's AI development by lowering the cost of inference and supporting a wider range of AI applications.

The MoE machine learning technique, which Ant has adopted, has gained traction among AI leaders such as Google and DeepSeek. This method breaks down tasks into smaller specialized segments, much like a team of experts each handling different parts of a job, leading to more efficient processing.
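To make the "team of experts" analogy concrete, here is a minimal, self-contained MoE routing sketch in Python with NumPy. The dimensions, the linear experts, and the softmax gating scheme are toy assumptions for illustration only; they do not reflect Ant's actual architecture:

```python
import numpy as np

# Toy Mixture-of-Experts (MoE) layer: a gating network scores all experts,
# but only the top-k experts actually compute for a given input, so most
# of the layer's parameters stay idle per token.
rng = np.random.default_rng(0)
DIM, NUM_EXPERTS, TOP_K = 8, 4, 2

# Each "expert" is a simple linear map; the gate scores experts per input.
experts = [rng.standard_normal((DIM, DIM)) for _ in range(NUM_EXPERTS)]
gate_weights = rng.standard_normal((DIM, NUM_EXPERTS))

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route input x through the top-k experts, weighted by gate scores."""
    scores = x @ gate_weights                 # one score per expert
    top_k = np.argsort(scores)[-TOP_K:]       # indices of the best experts
    weights = np.exp(scores[top_k])
    weights /= weights.sum()                  # softmax over the chosen experts
    # Only TOP_K of NUM_EXPERTS experts do any work for this input.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top_k))

x = rng.standard_normal(DIM)
y = moe_forward(x)
print(y.shape)  # (8,)
```

In production MoE models the experts are full feed-forward networks and the gate is trained jointly with them, but the efficiency argument is the same: compute scales with the number of *active* experts, not the total parameter count.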

However, training MoE models typically requires high-performance GPUs, such as those from Nvidia. The prohibitive cost of these chips has restricted broader adoption, particularly among smaller AI firms. Ant has been actively working to make large language model (LLM) training more efficient, aiming to reduce dependence on premium GPUs. The company's research paper explicitly states its goal of scaling AI training without relying on high-end Nvidia chips.

This approach contradicts Nvidia's long-term vision. Jensen Huang, the CEO of Nvidia, has argued that AI computation demand will continue to grow, even with more efficient models like DeepSeek's R1. He maintains that companies will prioritize better chips to generate revenue, rather than cheaper chips to cut costs. As a result, Nvidia has continued its strategy of building increasingly powerful GPUs with higher processing power, more transistors, and greater memory capacity.

Ant Group's research highlights the rapid innovation within China's AI industry and suggests that the nation is making strides toward AI self-sufficiency. By adopting cost-efficient, computationally optimized AI models, China is actively working around U.S. restrictions on advanced Nvidia chips.

According to Ant's estimates, training 1 trillion tokens—a fundamental unit of data used in AI learning—currently costs around 6.35 million yuan ($880,000) using conventional high-performance hardware. However, Ant's optimized approach would lower this cost to 5.1 million yuan, thanks to its ability to train models on less powerful hardware.
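The quoted figures can be sanity-checked with one line of arithmetic (the yuan amounts are those reported above; the dollar conversion is not recomputed here):

```python
# Reported cost of training 1 trillion tokens on conventional
# high-performance hardware vs. Ant's optimized approach.
baseline_yuan = 6_350_000
optimized_yuan = 5_100_000

savings = 1 - optimized_yuan / baseline_yuan
print(f"{savings:.1%}")  # 19.7%, consistent with the ~20% figure cited
```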

The company plans to apply its AI breakthroughs to industrial applications, particularly in healthcare and finance, sources said. Its Ling-Plus and Ling-Lite models, developed as part of this initiative, are expected to play a key role in these sectors.

Earlier this year, Ant acquired Chinese online platform Haodf.com, strengthening its AI-driven healthcare services. The company has also launched AI-powered applications, including Zhixiaobao, an AI-based life assistant, and Maxiaocai, a financial advisory AI service.

Ant's research suggests that its Ling-Lite model outperforms a Meta Llama model in English-language understanding benchmarks. Additionally, both Ling-Lite and Ling-Plus surpassed DeepSeek's equivalent models in Chinese-language tasks, demonstrating China's growing AI capabilities.

Robin Yu, Chief Technology Officer at Beijing-based Shengshang Tech Co., compared AI competition to martial arts, saying: "If you find one point of attack to beat the world's best kung fu master, you can still say you beat them. That's why real-world application matters."

Ant has open-sourced its Ling models, allowing researchers and developers worldwide to explore its innovations. The Ling-Lite model has 16.8 billion parameters, while Ling-Plus features 290 billion parameters. These parameters serve as the adjustable settings that fine-tune an AI model's performance.

For comparison, OpenAI's GPT-4.5, whose parameter count has not been officially disclosed, is estimated by industry experts to contain around 1.8 trillion parameters, according to MIT Technology Review. DeepSeek-R1, another Chinese model, boasts 671 billion parameters.

Despite its achievements, Ant faced difficulties during training, particularly in ensuring model stability. The company noted that even minor changes in hardware or the model's architecture caused instability, leading to spikes in the error rate. These challenges highlight the complexity of building AI models without relying on the industry's most advanced chips.

As China accelerates its push toward AI self-reliance, Ant Group's latest advancements reflect the country's determination to innovate despite export restrictions. If successful, these breakthroughs could reduce China’s dependence on foreign semiconductors—its key goal in the ongoing U.S.-China technology rivalry.
