Communications of the ACM - Artificial Intelligence, May 2, 23:27
Bringing AI to the Edge

This year, U.S. rail carrier Amtrak will be installing two novel inspection gateways from Duos Technologies along its busy Northeast Corridor. The barn-like Duos structures straddle railway tracks; as passenger trains speed through at up to 125 miles per hour, 97 cameras and dozens of LED lights arrayed around the sides, top, and bottom of the tracks will capture thousands of high-resolution images of the railcars. These images are aggregated and processed on site in real time to present a complete, 360-degree, highly detailed view of the train. Artificial intelligence (AI) algorithms running on Nvidia GPUs will analyze the images locally; if the model flags a potential structural or mechanical flaw, train personnel will be notified in less than a minute.

The Duos portal is one of many new examples of what is loosely categorized as edge AI, or the deployment and operation of AI models outside of massive cloud datacenters.

The precise definition of what constitutes an edge varies. “There’s a spectrum, from telecommunications points of presence in major cities to smartwatches, smart home devices, and Meta Ray-Bans,” said Shishir G. Patil, a Ph.D. student in computer science at the University of California, Berkeley. “They all come under this pretty broad category of edge devices.”

Operating AI models at the edge is challenging for a number of reasons. Typically, there is less computational capacity available, relative to the cloud. The power demands of AI models are much larger than those of traditional applications, which puts tremendous pressure on local hardware, forcing mobile devices to exhaust their battery power faster. Yet the move to the edge also reduces latency and eliminates the risk of inconsistent or unreliable bandwidth because there is no round trip to distant cloud datacenters. Purpose-built edge AI processors, like the on-device versions from Qualcomm or those from Hailo Inc., allow for real-time intelligent decision making, and there are multiple privacy and security benefits to edge processing. These and other factors have everyone from academic computer scientists to technology giants racing to develop more efficient means of pulling AI out of the cloud and closer to users.

There is computational capacity available all the way from local devices to the cloud, explained distributed AI researcher Lauri Lovén at the University of Oulu in Finland. Exactly where an AI model operates along this edge-cloud continuum depends in part on the use case. An autonomous vehicle that has to make a rapid, real-time traffic decision is better off eliminating the cloud latency and generating that result onboard. Conversely, a consumer photo-editing application powered by AI does not necessarily need to run on the user’s personal device—a latency-induced lag resulting from spotty bandwidth would be perfectly tolerable.

Raghubir Singh, an assistant professor of computer science at the U.K.’s University of Bath, suggests the nature of the given problem may determine where, along the continuum, a task is computed. “There will be a tradeoff. Some problems you will be able to solve locally using edge AI, while others will need GPUs with a lot more processing power,” he said. “Think of it like a primary school. You have a classroom teacher who can answer most of your questions, but then maybe there’s one that needs to be solved by someone outside the classroom.” A more complex problem or task could be pushed to a more robust AI model in the cloud for resolution.
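Singh's "classroom" analogy maps naturally onto a local-first, cloud-fallback pattern. The sketch below is purely illustrative — the stand-in models, confidence scores, and threshold are invented for this example, not taken from any system described here:

```python
# Local-first routing: answer on the edge device when the small model is
# confident, escalate to a larger cloud model otherwise. The heuristics
# inside run_edge_model() are placeholders for a real on-device model.

def run_edge_model(prompt: str) -> tuple[str, float]:
    """Stand-in for a small on-device model returning (answer, confidence)."""
    if len(prompt) < 40:           # pretend short prompts are "easy"
        return ("local answer", 0.92)
    return ("local guess", 0.35)

def run_cloud_model(prompt: str) -> str:
    """Stand-in for a large cloud-hosted model."""
    return "cloud answer"

def answer(prompt: str, threshold: float = 0.7) -> str:
    result, confidence = run_edge_model(prompt)
    if confidence >= threshold:
        return result                # solved in the "classroom"
    return run_cloud_model(prompt)   # escalated outside it
```

The design choice mirrors the tradeoff Singh describes: most queries never leave the device, and only the hard cases pay the latency and cost of the cloud round trip.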

Singh cites security and privacy as equally important variables. If a patient visits a local health center for a checkup, and machine learning algorithms can process data collected during the exam within that facility, then privacy concerns are minimized. “If that patient data is stored and assessed locally, and not going to the cloud, that sensitive information remains secure,” he explained.

Economics have become increasingly important as well. A personalized AI agent that understands your preferences might be too expensive to operate in the cloud, noted Berkeley’s Patil, because this would require reserving high-end cloud compute capacity just for the individual user, who would have to pay the compute costs to keep the model ready even when it is not in use. On the other hand, the costs of a general AI model maintained in the cloud for mass consumption can be distributed more efficiently across all its users. As a result, Patil and his colleagues are developing techniques to shrink, fine-tune, and personalize models so they can run on consumer hardware like smartphones and other edge devices.
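One common way to shrink a model for edge deployment — not necessarily the specific technique Patil's group uses — is post-training quantization, which trades a small amount of precision for a 4x smaller weight footprint. A toy NumPy sketch:

```python
import numpy as np

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric int8 quantization: store each weight in 1 byte instead of 4."""
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights for inference."""
    return q.astype(np.float32) * scale

w = np.array([0.8, -0.3, 0.05, -1.2], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# q occupies 4 bytes versus 16 for w; the reconstruction error stays small
```

Production pipelines add per-channel scales, calibration data, and quantization-aware fine-tuning, but the memory arithmetic is the same.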

The cost of operating the models in the cloud is also incentivizing AI leaders and cloud giants to push more tasks out to the edge, explained Patil. As advanced AI models grow larger, they consume more electricity and drive up the price of each inference. The most advanced large language models (LLMs) are one obvious example, according to Patil. “LLM inference today is super expensive,” he said. “It’s in the tens of cents per inference, if not higher, especially for the bigger models. Basically, the cloud providers are going to be bleeding money if they give away inferences for free.” But if more AI tools operate on edge devices like smartphones, the cloud provider will assume less of the electricity bill. Instead, the user will pay that cost by charging their phones more often.
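The economics Patil describes are easy to sketch. Assuming ten cents per inference — the low end of his range — and a hypothetical free service with a million daily users:

```python
# Back-of-the-envelope illustration; all figures are assumptions for
# illustration, not measurements.
cost_per_inference = 0.10            # "tens of cents per inference"
users = 1_000_000
inferences_per_user_per_day = 10

daily_bill = cost_per_inference * users * inferences_per_user_per_day
# roughly $1 million per day if the provider gives those inferences away
```

Shifting even a fraction of those inferences onto users' own devices moves that line item off the provider's books — and onto users' battery chargers.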

The latest smartphones—considered the extreme edge—are now being optimized for edge AI. The recently released Pixel phones incorporate new Google Tensor G4 chips designed specifically for AI workloads. The models running on these phones are also shrinking. When Apple announced its Apple Intelligence features, the company noted that the language model running on the device would have roughly 3 billion parameters, compared to cloud-based models that run to hundreds of billions or even a trillion-plus parameters. Similarly, Meta’s Llama 3.2 release included lightweight models with 1 billion and 3 billion parameters designed to run on devices.
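Those parameter counts translate directly into memory, which is what makes the on-device/cloud split concrete. A back-of-the-envelope sketch, assuming 16-bit weights (quantized deployments can roughly halve this again):

```python
def model_memory_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate weight storage at fp16/bf16 precision (2 bytes/parameter)."""
    # billions of parameters x bytes per parameter = gigabytes
    return params_billions * bytes_per_param

on_device = model_memory_gb(3)    # a ~3B-parameter on-device model: ~6 GB
frontier = model_memory_gb(400)   # a hypothetical 400B-parameter cloud model: ~800 GB
```

At ~6 GB, a 3-billion-parameter model already presses against what a flagship phone's RAM can spare alongside the operating system, which is why on-device models are both small and aggressively quantized.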

These smaller, purpose-built models are far more suitable to the edge. An advanced generative text-to-video model like OpenAI’s Sora requires significant cloud compute capacity, but simpler AI-enhanced tasks like transforming text to speech could be done with lightweight, efficient models. “You don’t need a Lamborghini to cross the road,” said Patil.

Electricity consumption is not just an economics problem, but an environmental one. Suzan Bayhan, an associate professor on the Faculty of Electrical Engineering, Mathematics and Computer Science at the University of Twente in the Netherlands, points to the sustainability of AI-specific edge devices or edge accelerators—technologies optimized for edge AI processing. “Deploying smarter systems means you want to collect, process, and act on data, and you want this computation to be closer to the user,” she explains. “But if you are running this computation, especially some sort of AI model, on devices that were not actually optimized for these kinds of workloads, they will use a lot of energy.”

The creation of new and more efficient devices could alleviate that problem, she noted, yet this will drive up consumption of raw materials and could lead to a different kind of sustainability problem because of newly outdated, discarded devices. Plus, even though AI models are driving up electricity consumption in datacenters—Goldman Sachs estimates a 160% increase by 2030—the cloud still allows for the potential to control the problem. “When you have cloud or more centralized resources, then you are benefiting from the efficiency of sharing among different parties,” Bayhan explained, “but if we move to devices with edge accelerators, we are limiting this benefit of economies of scale and multi-tenant operation.”

Not all applications will move out to that extreme edge, and researchers are working on novel methods of determining where different tasks might be completed or even how to break down and spread out different tasks along the edge-cloud continuum. “We want to create a platform that allows you to distribute applications within this compute continuum and dynamically optimize for, say, latency, computational capacity, money, or resource usage,” said the University of Oulu’s Lovén. “Based on the situation, the platform would rebalance or readjust the distribution of components.”
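The kind of multi-objective placement Lovén describes can be sketched as a weighted scoring over tiers of the continuum. Everything here — the tiers, their metrics, and the weighting — is an invented illustration, not the Oulu platform's actual design:

```python
# Each tier of the edge-cloud continuum with illustrative metrics:
#              latency_ms, $/hour, privacy_risk (0-1), battery_drain (0-1)
TIERS = {
    "device":    (5,   0.00, 0.0, 1.0),
    "edge_node": (20,  0.05, 0.4, 0.2),
    "cloud":     (120, 0.50, 0.8, 0.1),
}

def place(w_latency=0.0, w_cost=0.0, w_privacy=0.0, w_energy=0.0) -> str:
    """Pick the tier with the lowest weighted penalty for these priorities."""
    def penalty(tier: str) -> float:
        lat, cost, risk, drain = TIERS[tier]
        # scale the 0-1 metrics into a range comparable with latency_ms
        return (w_latency * lat + w_cost * cost * 100
                + w_privacy * risk * 100 + w_energy * drain * 100)
    return min(TIERS, key=penalty)
```

A privacy-weighted health workload lands on the device; a battery-constrained background task offloads to the cloud. A real platform would, as Lovén notes, re-run this decision dynamically as conditions change.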

This decision of where the processing takes place could also be left to individuals, said Bayhan. Her research group is developing methods that would give control to the users, so they can decide where the computation is done based on their own particular needs or preferences. A user could prioritize sustainability, fast performance, or personal security and privacy. “For instance,” Bayhan explained, “if it is my health data, I may not want this computation to be done in the cloud, but closer to my home or on my personal device, for instance, my smartwatch.”

Although Singh notes there is some hype surrounding edge AI at the moment, he suspects the combination of proliferating smart systems and devices and continued advances in AI will lead to major edge AI contributions and breakthroughs in the next five to 10 years. “There are so many research avenues with edge AI,” said Singh. “It’s such an interesting field, and I hope this work will bring a lot of benefits and advantages and allow more people to utilize edge AI models.”

Further Reading

  • Meuser, T., Lovén, L., Bhuyan, M., Patil, S., Dustdar, S., Aral, A., et al.
    Revisiting Edge AI: Opportunities and Challenges, IEEE Internet Computing 28, 4 (2024), 49–59.
  • Singh, R. and Gill, S.S.
    Edge AI: A Survey, Internet of Things and Cyber-Physical Systems 3 (2023), 71–92.
  • Gunter, T., Wang, Z., Wang, C., Pang, R., et al.
    Apple Intelligence Foundation Language Models, arXiv preprint arXiv:2407.21075 (2024).
  • Zhou, Z., Chen, X., Li, E., Zeng, L., Luo, K., and Zhang, J.
    Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing, Proceedings of the IEEE 107, 8 (2019), 1738–1762.
  • Arulraj, J., Chatterjee, A., Daglis, A., Dhekne, A., and Ramachandran, U.
    eCloud: A Vision for the Evolution of the Edge-Cloud Continuum, Computer 54, 5 (2021), 24–33.
