MarkTechPost@AI June 24
New AI Framework Evaluates Where AI Should Automate vs. Augment Jobs, Says Stanford Study

A Stanford University study proposes a new framework for evaluating where AI should automate versus augment work. Built on the WORKBank database, it combines the preferences of 1,500 domain workers with assessments from 52 AI experts. By introducing the Human Agency Scale (HAS), the study reveals gaps between technical capability and worker willingness. The results show that workers welcome AI automation of repetitive tasks but resist it for tasks involving creativity or interpersonal interaction. The framework offers actionable insights for responsible AI deployment, informing AI development, labor policy, and workforce training strategies.

🤖 **AI agents are reshaping how work gets done:** By executing complex, goal-directed tasks, AI agents are changing how jobs are performed. They combine multi-step planning with software tools to handle entire workflows in fields such as education, law, finance, and logistics.

🤝 **Bridging the gap between AI capability and worker preference:** The study highlights a disconnect between what AI agents can do and what workers want them to do. Even when an AI system is technically capable of taking over a task, workers may not support the shift due to concerns about job satisfaction, task complexity, or the importance of human judgment.

📊 **The WORKBank database:** The Stanford team built WORKBank, a survey-based auditing framework that assesses which tasks workers want automated or augmented and compares these preferences with expert evaluations of AI capability. The database contains responses from 1,500 domain workers and assessments from 52 AI experts.

⚖️ **The Human Agency Scale (HAS):** The study introduces the Human Agency Scale (HAS), a five-level metric measuring how much human involvement a task requires. For example, tasks rated H1 or H2 (such as transcribing data or generating routine reports) are well suited to independent AI execution, while tasks rated H4 or H5 (such as planning training programs or participating in security-related discussions) demand substantial human oversight.

🚦 **Four zones for AI deployment:** By combining worker preferences with expert capability ratings, the study divides tasks into four zones: the Automation "Green Light" Zone (high capability, high desire), the Automation "Red Light" Zone (high capability, low desire), the R&D Opportunity Zone (low capability, high desire), and the Low Priority Zone (low capability, low desire).

Redefining Job Execution with AI Agents

AI agents are reshaping how jobs are performed by offering tools that execute complex, goal-directed tasks. Unlike static algorithms, these agents combine multi-step planning with software tools to handle entire workflows across various sectors, including education, law, finance, and logistics. Their integration is no longer theoretical—workers are already applying them to support a variety of professional duties. The result is a labor environment in transition, where the boundaries of human and machine collaboration are being redefined on a daily basis.

Bridging the Gap Between AI Capability and Worker Preference

A persistent problem in this transformation is the disconnect between what AI agents can do and what workers want them to do. Even if AI systems are technically capable of taking over a task, workers may not support that shift due to concerns about job satisfaction, task complexity, or the importance of human judgment. Meanwhile, tasks that workers are eager to offload may lack mature AI solutions. This mismatch presents a significant barrier to the responsible and effective deployment of AI in the workforce.

Beyond Software Engineers: A Holistic Workforce Assessment

Until recently, assessments of AI adoption often centered on a handful of roles, such as software engineering or customer service, limiting understanding of how AI impacts broader occupational diversity. Most of these approaches also prioritized company productivity over worker experience. They relied on an analysis of current usage patterns, which does not provide a forward-looking view. As a result, the development of AI tools has lacked a comprehensive foundation grounded in the actual preferences and needs of people performing the work.

Stanford’s Survey-Driven WORKBank Database: Capturing Real Worker Voices

The research team from Stanford University introduced a survey-based auditing framework that evaluates which tasks workers would prefer to see automated or augmented and compares this with expert assessments of AI capability. Using task data from the U.S. Department of Labor’s O*NET database, researchers created the WORKBank, a dataset based on responses from 1,500 domain workers and evaluations from 52 AI experts. The team employed audio-supported mini-interviews to collect nuanced preferences. It introduced the Human Agency Scale (HAS), a five-level metric that captures the desired extent of human involvement in task completion.

Human Agency Scale (HAS): Measuring the Right Level of AI Involvement

At the center of this framework is the Human Agency Scale, which ranges from H1 (full AI control) to H5 (complete human control). This approach recognizes that not all tasks benefit from full automation, nor should every AI tool aim for it. For example, tasks rated H1 or H2—like transcribing data or generating routine reports—are well-suited for independent AI execution. Meanwhile, tasks such as planning training programs or participating in security-related discussions were often rated at H4 or H5, reflecting the high demand for human oversight. The researchers gathered dual inputs: workers rated their desire for automation and preferred HAS level for each task, while experts evaluated AI’s current capability for that task.
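The five-level scale above can be sketched as a simple enum. The H1 and H5 endpoint definitions and the task examples come from the article; the intermediate level descriptions and the helper function are illustrative assumptions, not the paper's exact wording.

```python
from enum import IntEnum

class HAS(IntEnum):
    """Human Agency Scale: desired human involvement in a task.
    H1/H5 definitions follow the article; H2-H4 descriptions
    are paraphrased assumptions for illustration."""
    H1 = 1  # full AI control (e.g., transcribing data)
    H2 = 2  # AI leads with minimal human input (e.g., routine reports)
    H3 = 3  # roughly equal human-AI partnership
    H4 = 4  # human leads, AI assists (e.g., planning training programs)
    H5 = 5  # complete human control (e.g., security-related discussions)

def suits_independent_ai(level: HAS) -> bool:
    """Tasks rated H1 or H2 are well suited to independent AI execution."""
    return level <= HAS.H2
```

A task's worker-preferred HAS level can then be compared against expert capability ratings, which is what the four-zone analysis below does.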

Insights from WORKBank: Where Workers Embrace or Resist AI

The results from the WORKBank database revealed clear patterns. Approximately 46.1% of tasks received a high desire for automation from workers, particularly those viewed as low-value or repetitive. Conversely, significant resistance was found in tasks involving creativity or interpersonal dynamics, regardless of AI's technical ability to perform them. By overlaying worker preferences and expert capabilities, tasks were divided into four zones: the Automation "Green Light" Zone (high capability and high desire), the Automation "Red Light" Zone (high capability but low desire), the R&D Opportunity Zone (low capability but high desire), and the Low Priority Zone (low desire and low capability). Notably, 41% of tasks associated with Y Combinator-funded startups fell into the Low Priority or Red Light zones, suggesting a misalignment between startup investment and worker needs.
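The two-axis zone assignment can be sketched as follows. The zone names match the study; the normalized scores, field names, and the 0.5 cutoff are illustrative assumptions, since the paper's actual scoring procedure is not given here.

```python
def assign_zone(worker_desire: float, expert_capability: float,
                threshold: float = 0.5) -> str:
    """Map a task to one of the four deployment zones.

    worker_desire and expert_capability are assumed normalized to [0, 1];
    the 0.5 threshold is an illustrative choice, not the paper's method.
    """
    high_desire = worker_desire >= threshold
    high_capability = expert_capability >= threshold
    if high_capability and high_desire:
        return "Green Light"      # automate: capable and wanted
    if high_capability:
        return "Red Light"        # capable, but workers resist
    if high_desire:
        return "R&D Opportunity"  # wanted, but AI not yet ready
    return "Low Priority"         # neither wanted nor feasible

# Hypothetical tasks: repetitive transcription vs. creative planning
print(assign_zone(0.9, 0.8))  # Green Light
print(assign_zone(0.2, 0.8))  # Red Light
```

Startup investment concentrated in the Red Light and Low Priority quadrants is what the study flags as a misalignment signal.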

Toward Responsible AI Deployment in the Workforce

This research offers a clear picture of how AI integration can be approached more responsibly. The Stanford team uncovered not only where automation is technically feasible but also where workers are receptive to it. Their task-level framework extends beyond technical readiness to encompass human values, making it a valuable tool for AI development, labor policy, and workforce training strategies.

TL;DR:

This paper introduces WORKBank, a large-scale dataset combining worker preferences and AI expert assessments across 844 tasks and 104 occupations, to evaluate where AI agents should automate or augment work. Using a novel Human Agency Scale (HAS), the study reveals a complex automation landscape, highlighting a misalignment between technical capability and worker desire. Findings show that workers welcome automation for repetitive tasks but resist it in roles requiring creativity or interpersonal skills. The framework offers actionable insights for responsible AI deployment aligned with human values.



