TMTPost (钛媒体): New insights into the future of business and life, January 27
ByteDance and OpenAI Race to Develop AI Agents; Nvidia Partner Predicts 84% of Apps Will Incorporate AI Inference by 2028

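The article examines the rapid development of AI agents, in particular UI-TARS, the next-generation automation model from ByteDance, and OpenAI's Operator. UI-TARS integrates key capabilities such as visual understanding, text processing, task planning, and memory management, and can carry out complex cross-platform tasks. OpenAI's Operator automates everyday tasks such as shopping and ordering food for users. The article also notes that other companies, such as Zhipu AI and Verses, have joined the AI agent race. AI agents are regarded as key to reaching artificial general intelligence (AGI) and are expected to be applied widely across industries, with large market potential. By 2025, AI agents are expected to become an important part of the business landscape.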

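🤖 UI-TARS: The next-generation automation model released by ByteDance, with 7 billion parameters. It integrates visual understanding, text processing, task planning, and memory management, and can perform complex tasks across platforms, such as automatically publishing tweets.

🛒 Operator: OpenAI's first AI agent application, aimed mainly at ChatGPT Pro users in the United States. It simulates human actions on the web to carry out tasks such as shopping and ordering food, and executes complex steps accurately by combining GPT-4's vision capabilities with reinforcement learning.

🚀 Market potential: The AI agent market is growing exponentially. China's market is projected to reach 852 billion yuan by 2028, a compound annual growth rate of 72.7%. Industries are actively exploring AI agent applications, including customer service, programming, content creation, and financial management.

💡 Key to AGI: AI agents are seen as a key step toward artificial general intelligence (AGI), because they can act as well as think. OpenAI's AGI roadmap places AI agents at the third stage, between reasoning AI and fully autonomous, innovative systems.

🏢 Enterprise applications: AI agents have great potential in enterprise operations. They can streamline processes, reduce labor costs, and open up new automation opportunities. By 2025, 77% of global enterprises are expected to deploy generative AI tools to improve productivity.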

(Image Source: Photo by Lin Zhijia, TMTPost AGI Editor)

AsianFin -- In the ever-evolving world of artificial intelligence, the race to build AI agents is heating up. Following OpenAI’s launch of its first AI-powered agent application, “Operator,” ByteDance on Sunday launched its next-generation automation model, UI-TARS, on GitHub.

With seven billion parameters, this AI agent integrates crucial components such as visual understanding, text processing, task planning, and memory management into one unified model.

UI-TARS can perform complex, cross-platform tasks, perceiving user interfaces, reasoning through action steps, and interacting with web interfaces in ways previously thought to be exclusive to human operators.

While still in its preview phase and undergoing constant updates, UI-TARS has already made its mark by demonstrating the ability to “automatically” publish tweets, as seen in the official promotional video. Although the system currently requires human assistance for certain steps, such as inputting text and clicking through options, its potential is unmistakable. The model is already available for macOS and Windows users. 

 

 

The Operator Revolution

Only two days earlier, OpenAI introduced its first AI agent, “Operator.” Aimed at U.S. ChatGPT Pro users with a monthly subscription of $200, Operator is a digital assistant capable of simulating human operations on the web. It can perform tasks such as shopping, ordering food, and organizing papers by seamlessly integrating visual recognition and advanced reasoning models. By using a combination of GPT-4’s visual capabilities and reinforcement learning, the AI agent plans complex steps and takes actions with impressive accuracy.

The proliferation of AI agents in recent months has been nothing short of remarkable. Other notable players, including Zhipu AI and Verses, have joined the AI agent race. Zhipu's AutoGLM and GLM-PC have garnered attention, while Genius, the AI agent from Verses, reportedly needed only two hours of training and a fraction of the data to surpass human-level players in the classic game Pong.

Even Nvidia's CEO, Jensen Huang, weighed in at CES 2025, predicting that AI agents will be the next frontier of the robotics industry, with a potential value in the trillions of dollars. OpenAI's CEO, Sam Altman, has also said that AI agents could become a significant force in 2025, heralding the beginning of a new era in AI applications. Together, these remarks suggest that 2025 could be a watershed year for AI agents, positioning them as a key area of technological growth.

A New Frontier in AI Development

AI agents are essentially intelligent entities that can autonomously perceive their environment, make decisions, and take action. Think of them as highly capable assistants that can understand tasks and help humans perform them more efficiently. For example, UI-TARS can act like a "smart assistant" that navigates the web, recognizes visual cues, plans the necessary steps, and executes complex actions, such as publishing content or making purchases, with little human intervention.

The concept of AI agents began to take off after the success of ChatGPT in late 2022. Researchers at Stanford University and Google published a paper on “Generative Agents,” which described how virtual people in a simulated environment exhibited behaviors similar to humans when integrated with ChatGPT. This research sparked widespread interest in the idea of AI agents.

By 2024, AI agents had been recognized as essential components in the development of Artificial General Intelligence (AGI). Stanford professor Andrew Ng has pointed out that AI agents will play a critical role in the progression toward AGI, describing them as systems that not only think but can also take action. OpenAI's roadmap for AGI, which spans five stages, places AI agents at the third level, between reasoning AI and fully autonomous, innovative systems.

A recent report highlighted the exponential growth of the AI agent market in China. In 2023, the Chinese AI agent market was valued at 55.4 billion yuan, and it is projected to grow to 852 billion yuan by 2028, with an impressive compound annual growth rate of 72.7%. These projections underscore the immense potential of AI agents as an integral part of future industries.
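
The growth rate quoted above can be checked against the two market figures. The short Python sketch below assumes 2023 is the base year and 2028 is five years later. The function name `cagr` is illustrative and does not come from the report.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the report cited above, in billions of yuan.
market_2023 = 55.4
market_2028 = 852.0

# Assumption: growth compounds annually over the five years 2023 -> 2028.
rate = cagr(market_2023, market_2028, years=2028 - 2023)
print(f"Implied CAGR: {rate:.1%}")  # prints "Implied CAGR: 72.7%"
```

Under that assumption, the two market values and the stated 72.7% growth rate are consistent with one another.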

AI Agents Across Industries

AI agents are rapidly gaining traction in various industries, from customer service to programming, content creation, and financial management. In content creation, for instance, AI agents can generate videos or even write scripts autonomously. This level of efficiency has led to broader adoption of AI assistants by creators, further cementing AI’s role as an indispensable tool in modern workflows.

Operator, for example, serves as a highly practical tool. It can perform everyday tasks such as making restaurant reservations, buying groceries, and even booking tickets for sports events. It employs a straightforward workflow in which it captures and analyzes screen content, adds the relevant information to its model context, and determines the next steps through reasoning. It then executes these steps using a virtual mouse and keyboard. The human user can intervene if necessary, particularly in situations involving sensitive information like payment details or addresses.
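
The workflow described above can be sketched as a simple observe, reason, act loop. The Python below is an illustrative sketch only, not OpenAI's implementation. `run_agent`, `Action`, `AgentRun`, and the keyword-based sensitivity check are hypothetical names and simplifications. The screen capture, reasoning, and input functions are passed in as stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical keywords marking screens the user should handle personally.
SENSITIVE_KEYWORDS = ("payment", "password", "address")


@dataclass
class Action:
    kind: str          # e.g. "click", "type", "done"
    target: str = ""   # description of the UI element or the text to type


@dataclass
class AgentRun:
    context: List[str] = field(default_factory=list)  # accumulated observations
    log: List[str] = field(default_factory=list)      # what happened at each step


def run_agent(
    capture_screen: Callable[[], str],
    choose_action: Callable[[List[str]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> AgentRun:
    """Observe, reason, act; hand control to the user on sensitive screens."""
    run = AgentRun()
    for _ in range(max_steps):
        screen = capture_screen()            # 1. capture the screen content
        run.context.append(screen)           # 2. add it to the model context
        if any(word in screen.lower() for word in SENSITIVE_KEYWORDS):
            run.log.append("paused: user takes over")
            break
        action = choose_action(run.context)  # 3. decide the next step
        if action.kind == "done":
            run.log.append("done")
            break
        execute(action)                      # 4. act via virtual mouse/keyboard
        run.log.append(f"{action.kind}: {action.target}")
    return run


# Toy stand-ins so the sketch runs without a real browser or model.
screens = iter(["Restaurant search page", "Table selection", "Payment details form"])
performed: List[Action] = []

result = run_agent(
    capture_screen=lambda: next(screens),
    choose_action=lambda ctx: Action("click", f"next button on '{ctx[-1]}'"),
    execute=performed.append,
)
print(result.log)
```

In this toy run the loop performs two clicks and then pauses at the payment screen, mirroring the hand-off to the user on sensitive steps.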

According to OpenAI, Operator is designed to perform tasks independently for users, giving them a smooth, automated experience. In a demonstration, the agent completed various tasks with minimal input from the user. However, it pauses on sensitive steps, such as payment, so users can take control when needed.

AI agents are also poised to make a major impact on enterprise operations. According to F5’s Mohan Veloo, AI applications will increasingly rely on APIs, and the growth of AI usage will lead to an explosion of these interfaces. By 2025, it’s expected that 77% of global enterprises will deploy generative AI tools to improve productivity, with over 84% of all applications incorporating AI inference capabilities by 2028.

AI agents can streamline processes, reduce human labor costs, and provide businesses with new opportunities for automation. However, as AI becomes more pervasive, some experts warn that AI’s democratization of knowledge may level the playing field, removing some of the competitive advantages previously held by leading firms.

For enterprises, the challenge lies in finding the most effective ways to integrate AI agents into their operations. As Zhang Xin from Volcano Engine noted, while AI models bring new productivity tools, they also introduce challenges related to managing the massive amounts of data generated by AI operations. Companies must focus on creating AI solutions that drive innovation while leveraging existing technologies.

The Future of AI Agents: From Adoption to Integration

In the coming years, the widespread adoption of AI agents will likely become a defining characteristic of business transformation. According to F5’s Veloo, the increasing fusion of AI technologies with IoT, edge computing, and cloud-native architecture is accelerating AI’s integration into business processes. This trend will drive enterprises to implement AI solutions that can seamlessly collaborate with human workers, boosting both productivity and efficiency.

In the second phase of AI's revolution, AI agents like those from ByteDance, OpenAI, and other major players in the industry are pushing the boundaries of what's possible. Whether it's automating daily tasks or offering new solutions for business optimization, the future of AI agents looks incredibly promising.

In 2025, AI agents are expected to become a significant part of the business landscape, offering a glimpse into the future of work. As the technology continues to evolve, it's clear that AI agents will be more than tools: they will be invaluable partners in the journey toward a more intelligent and automated world.

