AI News — April 3, 18:06
Ant Group uses domestic chips to train AI models and cut costs

Ant Group is actively adopting Chinese-made semiconductor chips to train AI models, aiming to cut costs and reduce its dependence on restricted US technology. The company uses chips from domestic suppliers, including Alibaba and Huawei, to train large language models with the Mixture of Experts (MoE) method, and the results are reportedly comparable to those achieved with Nvidia's H800 chips. The move signals Ant's deeper involvement in the growing AI race between Chinese and US tech firms, and reflects Chinese companies' efforts to work around export restrictions and find cost-effective ways to train AI models. Ant plans to apply its models to industries such as healthcare and finance, and has open-sourced them.

💡 Ant Group is training AI models on Chinese-made semiconductor chips. The chips come from domestic suppliers, including Alibaba and Huawei, and are used to train large language models with the Mixture of Experts (MoE) method, with the aim of cutting costs and reducing dependence on US technology.

💻 The results Ant Group achieved training AI models on domestic chips are comparable to those from Nvidia's H800 chips. While Ant still uses Nvidia chips for some AI development, it is increasingly turning to alternatives from AMD and Chinese chip-makers for its latest models.

💰 By using lower-specification chips and optimising its training method, Ant Group cut the cost of training one trillion tokens (the basic units of data AI models learn from) from about 6.35 million yuan (roughly $880,000) to about 5.1 million yuan.

🚀 Ant Group plans to apply the models produced this way (Ling-Plus and Ling-Lite) to industrial AI use cases such as healthcare and finance. The company previously acquired Haodf.com, a Chinese online medical platform, to further its deployment of AI-based healthcare solutions.

🔓 Ant Group has open-sourced its models. Ling-Lite has 16.8 billion parameters, while Ling-Plus has 290 billion. For comparison, the closed-source GPT-4.5 is estimated to have around 1.8 trillion parameters.

Ant Group is relying on Chinese-made semiconductors to train artificial intelligence models to reduce costs and lessen dependence on restricted US technology, according to people familiar with the matter.

The Alibaba-affiliated company has used chips from domestic suppliers, including those tied to Alibaba and Huawei Technologies, to train large language models using the Mixture of Experts (MoE) method. The results were reportedly comparable to those produced with Nvidia's H800 chips. While Ant continues to use Nvidia chips for some of its AI development, one source said the company is turning increasingly to alternatives from AMD and Chinese chip-makers for its latest models.

The development signals Ant’s deeper involvement in the growing AI race between Chinese and US tech firms, particularly as companies look for cost-effective ways to train models. The experimentation with domestic hardware reflects a broader effort among Chinese firms to work around export restrictions that block access to high-end chips like Nvidia’s H800, which, although not the most advanced, is still one of the more powerful GPUs available to Chinese organisations.

Ant has published a research paper describing its work, stating that its models, in some tests, performed better than those developed by Meta. Bloomberg News, which initially reported the matter, has not verified the company’s results independently. If the models perform as claimed, Ant’s efforts may represent a step forward in China’s attempt to lower the cost of running AI applications and reduce the reliance on foreign hardware.

MoE models divide tasks into smaller data sets handled by separate components, and have gained attention among AI researchers and data scientists. The technique has been used by Google and the Hangzhou-based startup DeepSeek. The concept is similar to having a team of specialists, each handling part of a task, which makes producing models more efficient. Ant declined to comment on its hardware sources.
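The routing idea behind MoE can be sketched in a few lines. This is a toy illustration only, not Ant's or anyone's actual implementation: all dimensions and weights below are invented placeholders, and the router simply picks the top-k experts per input so that only a fraction of the model's parameters run on each example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Mixture of Experts: a router scores every expert for each input,
# and only the top-k scoring experts actually process that input.
# All sizes and weights are arbitrary placeholders for illustration.
n_experts, d_in, d_out, top_k = 4, 8, 8, 2
experts = [rng.normal(size=(d_in, d_out)) for _ in range(n_experts)]
router = rng.normal(size=(d_in, n_experts))

def moe_forward(x):
    """Route input x to its top-k experts and mix their outputs."""
    scores = x @ router                    # one routing score per expert
    chosen = np.argsort(scores)[-top_k:]   # indices of the best experts
    weights = np.exp(scores[chosen])
    weights /= weights.sum()               # softmax over the chosen experts
    # Only the selected experts run, so compute scales with k, not n_experts.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, chosen))

x = rng.normal(size=d_in)
y = moe_forward(x)
print(y.shape)  # (8,)
```

The "team of specialists" analogy maps directly onto the code: the router is the dispatcher, and the unselected experts sit idle, which is what lets MoE models grow their total parameter count without a matching growth in per-token compute.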

Training MoE models typically depends on high-performance GPUs, which can be too expensive for smaller companies to acquire or use. Ant's research focused on reducing that cost barrier, and the paper's title is suffixed with a clear objective: scaling models "without premium GPUs." [our quotation marks]

The direction taken by Ant and the use of MoE to reduce training costs contrast with Nvidia's approach. CEO Jensen Huang has said that demand for computing power will continue to grow, even with the introduction of more efficient models like DeepSeek's R1. His view is that companies will seek more powerful chips to drive revenue growth, rather than aiming to cut costs with cheaper alternatives. Nvidia's strategy remains focused on building GPUs with more cores, transistors, and memory.

According to the Ant Group paper, training one trillion tokens – the basic units of data AI models use to learn – cost about 6.35 million yuan (roughly $880,000) using conventional high-performance hardware. The company’s optimised training method reduced that cost to around 5.1 million yuan by using lower-specification chips.
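Taking the two figures cited above at face value, the optimised method saves roughly a fifth of the training cost per trillion tokens. A quick back-of-the-envelope check:

```python
# Back-of-the-envelope check of the reported training-cost savings,
# using the per-trillion-token figures cited in the paper's reporting (yuan).
baseline_cost = 6_350_000   # conventional high-performance hardware
optimised_cost = 5_100_000  # optimised method on lower-specification chips

saving = baseline_cost - optimised_cost
saving_pct = 100 * saving / baseline_cost
print(f"Saved {saving:,} yuan per trillion tokens ({saving_pct:.1f}%)")
# Saved 1,250,000 yuan per trillion tokens (19.7%)
```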

Ant said it plans to apply the models produced in this way – Ling-Plus and Ling-Lite – to industrial AI use cases like healthcare and finance. Earlier this year, the company acquired Haodf.com, a Chinese online medical platform, to further its ambition of deploying AI-based solutions in healthcare. It also operates other AI services, including a virtual assistant app called Zhixiaobao and a financial advisory platform known as Maxiaocai.

“If you find one point of attack to beat the world’s best kung fu master, you can still say you beat them, which is why real-world application is important,” said Robin Yu, chief technology officer of Beijing-based AI firm, Shengshang Tech.

Ant has made its models open source. Ling-Lite has 16.8 billion parameters – the learned values that determine how a model behaves – while Ling-Plus has 290 billion. For comparison, estimates suggest the closed-source GPT-4.5 has around 1.8 trillion parameters, according to MIT Technology Review.

Despite this progress, Ant's paper noted that training such models remains challenging. Small adjustments to the hardware or model structure during training sometimes resulted in unstable performance, including spikes in error rates.

(Photo by Unsplash)


The post Ant Group uses domestic chips to train AI models and cut costs appeared first on AI News.
