MIT Technology Review » Artificial Intelligence 05月30日 22:43
Fueling seamless AI at scale
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

文章探讨了人工智能(AI)发展对算力提出的新需求,以及硬件、算法和生态系统如何协同发展以应对这些挑战。随着AI模型日益庞大和应用场景不断扩展,传统计算架构面临瓶颈。文章强调了硬件创新(如新型芯片和计算架构)、算法优化(如小样本学习和模型压缩)以及生态系统建设(如行业标准和开放平台)的重要性。通过这些努力,AI有望成为像电力或半导体一样的通用技术,为个人和企业带来无缝体验。

⚙️ 硬件创新是AI发展的关键驱动力。传统硅芯片面临摩尔定律瓶颈,需要探索新的计算架构,如光子计算和量子计算,以满足日益增长的算力需求。

🧠 算法优化提升AI效率。从单体模型到基于代理的系统转变,以及小样本学习和模型量化等技术,使得AI系统能够在更少的计算资源下实现更高的性能。

🌐 生态系统建设推动AI普及。统一的标准、框架和平台,以及开放源代码的贡献,有助于促进AI技术的广泛应用和创新,确保AI的益处惠及所有人。

From large language models (LLMs) to reasoning agents, today’s AI tools bring unprecedented computational demands. Trillion-parameter models, workloads running on-device, and swarms of agents collaborating to complete tasks all require a new paradigm of computing to become truly seamless and ubiquitous.

First, technical progress in hardware and silicon design is critical to pushing the boundaries of compute. Second, advances in machine learning (ML) allow AI systems to achieve increased efficiency with smaller computational demands. Finally, the integration, orchestration, and adoption of AI into applications, devices, and systems is crucial to delivering tangible impact and value.

Silicon’s mid-life crisis

AI has evolved from classical ML to deep learning to generative AI. The most recent chapter, which took AI mainstream, hinges on two phases—training and inference—that are data and energy-intensive in terms of computation, data movement, and cooling. At the same time, Moore’s Law, which determines that the number of transistors on a chip doubles every two years, is reaching a physical and economic plateau.

For the last 40 years, silicon chips and digital technology have nudged each other forward—every step ahead in processing capability frees the imagination of innovators to envision new products, which require yet more power to run. That is happening at light speed in the AI age.

As models become more readily available, deployment at scale puts the spotlight on inference and the application of trained models for everyday use cases. This transition requires the appropriate hardware to handle inference tasks efficiently. Central processing units (CPUs) have managed general computing tasks for decades, but the broad adoption of ML introduced computational demands that stretched the capabilities of traditional CPUs. This has led to the adoption of graphics processing units (GPUs) and other accelerator chips for training complex neural networks, due to their parallel execution capabilities and high memory bandwidth that allow large-scale mathematical operations to be processed efficiently.

But CPUs are already the most widely deployed and can be companions to processors like GPUs and tensor processing units (TPUs). AI developers are also hesitant to adapt software to fit specialized or bespoke hardware, and they favor the consistency and ubiquity of CPUs. Chip designers are unlocking performance gains through optimized software tooling, adding novel processing features and data types specifically to serve ML workloads, integrating specialized units and accelerators, and advancing silicon chip innovations, including custom silicon. AI itself is a helpful aid for chip design, creating a positive feedback loop in which AI helps optimize the chips that it needs to run. These enhancements and strong software support mean modern CPUs are a good choice to handle a range of inference tasks.

Beyond silicon-based processors, disruptive technologies are emerging to address growing AI compute and data demands. The unicorn start-up Lightmatter, for instance, introduced photonic computing solutions that use light for data transmission to generate significant improvements in speed and energy efficiency. Quantum computing represents another promising area in AI hardware. While still years or even decades away, the integration of quantum computing with AI could further transform fields like drug discovery and genomics.

Understanding models and paradigms

The developments in ML theories and network architectures have significantly enhanced the efficiency and capabilities of AI models. Today, the industry is moving from monolithic models to agent-based systems characterized by smaller, specialized models that work together to complete tasks more efficiently at the edge—on devices like smartphones or modern vehicles. This allows them to extract increased performance gains, like faster model response times, from the same or even less compute.

Researchers have developed techniques, including few-shot learning, to train AI models using smaller datasets and fewer training iterations. AI systems can learn new tasks from a limited number of examples to reduce dependency on large datasets and lower energy demands. Optimization techniques like quantization, which lower the memory requirements by selectively reducing precision, are helping reduce model sizes without sacrificing performance. 

New system architectures, like retrieval-augmented generation (RAG), have streamlined data access during both training and inference to reduce computational costs and overhead. The DeepSeek R1, an open source LLM, is a compelling example of how more output can be extracted using the same hardware. By applying reinforcement learning techniques in novel ways, R1 has achieved advanced reasoning capabilities while using far fewer computational resources in some contexts.

The integration of heterogeneous computing architectures, which combine various processing units like CPUs, GPUs, and specialized accelerators, has further optimized AI model performance. This approach allows for the efficient distribution of workloads across different hardware components to optimize computational throughput and energy efficiency based on the use case.

Orchestrating AI

As AI becomes an ambient capability humming in the background of many tasks and workflows, agents are taking charge and making decisions in real-world scenarios. These range from customer support to edge use cases, where multiple agents coordinate and handle localized tasks across devices.

With AI increasingly used in daily life, the role of user experiences becomes critical for mass adoption. Features like predictive text in touch keyboards, and adaptive gearboxes in vehicles, offer glimpses of AI as a vital enabler to improve technology interactions for users.

Edge processing is also accelerating the diffusion of AI into everyday applications, bringing computational capabilities closer to the source of data generation. Smart cameras, autonomous vehicles, and wearable technology now process information locally to reduce latency and improve efficiency. Advances in CPU design and energy-efficient chips have made it feasible to perform complex AI tasks on devices with limited power resources. This shift toward heterogeneous compute enhances the development of ambient intelligence, where interconnected devices create responsive environments that adapt to user needs.

Seamless AI naturally requires common standards, frameworks, and platforms to bring the industry together. Contemporary AI brings new risks. For instance, by adding more complex software and personalized experiences to consumer devices, it expands the attack surface for hackers, requiring stronger security at both the software and silicon levels, including cryptographic safeguards and transforming the trust model of compute environments.

More than 70% of respondents to a 2024 DarkTrace survey reported that AI-powered cyber threats significantly impact their organizations, while 60% say their organizations are not adequately prepared to defend against AI-powered attacks.

Collaboration is essential to forging common frameworks. Universities contribute foundational research, companies apply findings to develop practical solutions, and governments establish policies for ethical and responsible deployment. Organizations like Anthropic are setting industry standards by introducing frameworks, such as the Model Context Protocol, to unify the way developers connect AI systems with data. Arm is another leader in driving standards-based and open source initiatives, including ecosystem development to accelerate and harmonize the chiplet market, where chips are stacked together through common frameworks and standards. Arm also helps optimize open source AI frameworks and models for inference on the Arm compute platform, without needing customized tuning. 

How far AI goes to becoming a general-purpose technology, like electricity or semiconductors, is being shaped by technical decisions taken today. Hardware-agnostic platforms, standards-based approaches, and continued incremental improvements to critical workhorses like CPUs, all help deliver the promise of AI as a seamless and silent capability for individuals and businesses alike. Open source contributions are also helpful in allowing a broader range of stakeholders to participate in AI advances. By sharing tools and knowledge, the community can cultivate innovation and help ensure that the benefits of AI are accessible to everyone, everywhere.

Learn more about Arm’s approach to enabling AI everywhere.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.

This content was researched, designed, and written entirely by human writers, editors, analysts, and illustrators. This includes the writing of surveys and collection of data for surveys. AI tools that may have been used were limited to secondary production processes that passed thorough human review.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

人工智能 算力 算法 生态系统
相关文章