EnterpriseAI, December 5, 2024
AWS Delivers the AI Heat: Project Rainier and GenAI Innovations Lead the Way

At its re:Invent 2024 conference, Amazon Web Services (AWS) announced a series of major AI initiatives: a partnership with Anthropic to build Project Rainier, one of the world's largest AI supercomputers; the launch of the Nova family of foundation models; and the Trainium2 AI chip, all aimed at cutting the cost and raising the performance of generative AI. The Nova family ranges from lightweight text-only models to multimodal models and emphasizes cost efficiency and speed. Trainium2 is now generally available with substantial performance gains, and a next-generation Trainium3 chip is on the way. Together, these moves underline Amazon's ambitions in AI and its strategy of advancing the field through both in-house development and partnerships.

🤔 **AWS announced a partnership with Anthropic to build Project Rainier, an "Ultracluster" supercomputer powered by Trainium2 chips.** Slated for completion in 2025, the cluster will contain hundreds of thousands of Trainium2 chips and deliver more than five times the compute used to train Anthropic's current AI models, making Trainium a serious challenger to Nvidia GPUs in the AI chip market.

💡 **Amazon introduced the Nova family of foundation models, aimed at delivering more cost-effective and faster generative AI.** The family spans lightweight text-only models, multimodal models, and image and video generation models, all available through Amazon Bedrock.

🚀 **Trainium2 chips are now generally available through two new cloud services: Amazon EC2 Trn2 instances and Trn2 UltraServers.** Compared with the current generation of GPU-based instances, Trainium2 delivers 30-40% better price performance and provides the compute needed to train and deploy LLMs with billions of parameters.

💻 **Amazon is developing its next-generation AI chip, Trainium3, slated for release next year.** Trainium3 is expected to be twice as fast as Trainium2 and 40% more energy-efficient, further accelerating the development and deployment of large AI models.

🤝 **Amazon is working with companies such as Apple to drive adoption of Trainium chips.** Apple plans to integrate Trainium into Apple Intelligence, its AI technology platform, a sign of growing industry acceptance of the chips.

At AWS re:Invent 2024 in Las Vegas, Amazon unveiled a series of transformative AI initiatives, including the development of one of the world's largest AI supercomputers in partnership with Anthropic, the introduction of the Nova series of AI foundation models, and the availability of the Trainium2 AI chip, positioning itself as a formidable competitor in the artificial intelligence landscape. 

Amazon CEO Andy Jassy emphasized the critical role of cost efficiency in generative AI development, highlighting the industry's growing demand for alternative AI infrastructure solutions that deliver better price performance. 

“One of the big lessons that we've learned from having about 1000 generative AI applications that we're either in the process of building or have launched at Amazon, is that the cost of compute in these generative AI applications really matters, and is often the difference maker of whether you can do it or you can't,” Jassy said in a recap video. “And to date, all of us have used just one chip in the compute for generative AI. And people are hungry for better price performance.” 

Project Rainier

AWS announced Project Rainier, a groundbreaking "Ultracluster" supercomputer powered by its Trainium chips. This massive cluster will contain hundreds of thousands of Trainium2 chips, delivering more than five times the exaflops used to train Anthropic's current generation of AI models. 

AWS Trainium2 AI chip. (Source: AWS)

AWS Trainium chips are positioned as a direct competitor to the Nvidia GPUs currently dominating the market. Project Rainier, set to be completed in 2025, could potentially set new records for size and performance. 

The announcement has already excited investors, with Amazon’s stock price rising more than 1% to nearly $213 following the news. A key partner in this venture is AI startup Anthropic, valued at $18 billion. AWS has invested $8 billion in the company, and Anthropic plans to leverage Project Rainier to train its AI models. The two firms are also working together to enhance the capabilities of Amazon’s Trainium chips, signaling a deep integration of R&D efforts. 

At the same time, AWS is advancing Project Ceiba, another supercomputer initiative developed in collaboration with Nvidia. Project Ceiba will feature over 20,000 Nvidia Blackwell GPUs, emphasizing AWS's strategy to diversify its AI infrastructure offerings. While Rainier focuses on Trainium chip adoption, Ceiba highlights AWS's ability to work with other industry leaders to support diverse AI workloads. 

Amazon Nova: A New Generation of Foundation Models

The company introduced its Nova family of foundation models, spanning from lightweight text-only models to larger and more advanced language models, as well as models designed to generate images and videos. 

The new Nova models will be available in Amazon Bedrock, the company’s platform for building generative AI apps. 
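For developers, access works like any other Bedrock model. The following is a minimal sketch using the boto3 Bedrock Runtime Converse API; the Nova model ID shown is illustrative, so confirm the exact identifier available in your account and region.

```python
import boto3

# Nova models are served through Amazon Bedrock; use the Bedrock Runtime client.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Illustrative model ID -- check the Bedrock console for the Nova IDs in your region.
MODEL_ID = "amazon.nova-lite-v1:0"

response = client.converse(
    modelId=MODEL_ID,
    messages=[
        {
            "role": "user",
            "content": [{"text": "Summarize the key announcements from AWS re:Invent 2024."}],
        }
    ],
    inferenceConfig={"maxTokens": 512, "temperature": 0.2},
)

# The Converse API returns the assistant message as a list of content blocks.
print(response["output"]["message"]["content"][0]["text"])
```

Because the Converse API is uniform across Bedrock models, swapping between Nova variants (or other Bedrock models) is a one-line change to the model ID.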

The new models include Amazon Nova Micro, a fast, low-cost text-only model; Amazon Nova Lite and Nova Pro, multimodal models that understand text, images, and video; and Amazon Nova Canvas and Nova Reel for image and video generation, respectively.

“Our new Amazon Nova models are intended to help with these challenges for internal and external builders, and provide compelling intelligence and content generation while also delivering meaningful progress on latency, cost-effectiveness, customization, retrieval augmented generation (RAG), and agentic capabilities,” said Rohit Prasad, SVP of Amazon Artificial General Intelligence. 

Jassy says the company has made “tremendous” progress on its new frontier models, noting how “they benchmark very competitively” and are cost-effective and fast: “They're 75% less expensive than the other leading models in Bedrock. They are laser fast. They're the fastest models you're going to find there,” he said. “Nova models allow you to do fine tuning, and increasingly, our application builders for generative AI want to fine-tune the models with their own label data and examples. It allows you to do model distillation, which means taking a big model and infusing that intelligence in a smaller model, so that you get lower latency and lower cost.” 

Addressing the fight against hallucinations and inaccuracy, AWS says Amazon Nova models are integrated with Amazon Bedrock Knowledge Bases and excel at Retrieval Augmented Generation (RAG), enabling customers to ensure the best accuracy by grounding responses in an organization’s own data. 
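In practice, this grounding is exposed through Bedrock's RetrieveAndGenerate API, which queries a Knowledge Base and then generates an answer from the retrieved documents. Below is a minimal sketch; the Knowledge Base ID and model ARN are placeholders, not values from the article.

```python
import boto3

# Knowledge Bases are queried through the Bedrock Agent Runtime client.
client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = client.retrieve_and_generate(
    input={"text": "What did AWS announce about Trainium2 at re:Invent 2024?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            # Placeholders: substitute your own Knowledge Base ID and model ARN.
            "knowledgeBaseId": "KB1234567890",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.nova-pro-v1:0",
        },
    },
)

# The generated answer is grounded in the retrieved documents;
# response["citations"] lists the source passages that were used.
print(response["output"]["text"])
```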

Trainium Gets an Upgrade

Powering these developments are AWS's Trainium2 chips, now available through two new cloud services. The company announced the general availability of Trainium2-powered Amazon Elastic Compute Cloud (Amazon EC2) Trn2 instances, as well as new Trn2 UltraServers.

Amazon EC2 Trn2 UltraServers. (Source: AWS)

The company says these instances deliver 30–40% better price performance compared to the current generation of GPU-based EC2 P5e and P5en instances. Equipped with 16 Trainium2 chips, Trn2 instances offer 20.8 peak petaflops of compute, making them ready for training and deploying billion-parameter LLMs. 
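Provisioning that capacity looks like any other EC2 launch. A minimal boto3 sketch follows, assuming the trn2.48xlarge instance type; the AMI ID and key pair name are placeholders, and in practice you would use a Neuron-enabled Deep Learning AMI and then train through the AWS Neuron SDK's framework integrations (for example, PyTorch).

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch a single Trn2 instance (each trn2.48xlarge carries 16 Trainium2 chips).
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder: use a Neuron-enabled AMI
    InstanceType="trn2.48xlarge",
    MinCount=1,
    MaxCount=1,
    KeyName="my-key-pair",             # placeholder key pair name
)

print(response["Instances"][0]["InstanceId"])
```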

The new EC2 Trn2 UltraServers link 64 Trainium2 chips via AWS's NeuronLink interconnect. With up to 83.2 peak petaflops of compute, an UltraServer quadruples the compute, memory, and networking of a single Trn2 instance.

Looking ahead, AWS unveiled its next-generation AI chip, Trainium3. This chip is designed to accelerate the development of even larger models and enhance real-time performance during deployment. Trainium3 will be available next year and will be up to twice as fast as the existing Trainium2 while being 40% more energy-efficient, AWS CEO Matt Garman revealed during his keynote on Tuesday.

The growing adoption of Trainium chips by major players, including Apple, adds to the company’s momentum. Benoit Dupin, Apple’s senior director of machine learning and AI, revealed plans to incorporate Trainium into Apple Intelligence, Apple’s AI technology platform. 

These latest developments underscore AWS's dual approach to its AI plans: innovating through proprietary technologies like Trainium while partnering with established players like Nvidia to provide comprehensive AI offerings. As AWS continues to expand its influence in AI computing, its investments and collaborations look to be setting the stage for significant industry disruption.
