AWS Machine Learning Blog 2024年10月26日
How Planview built a scalable AI Assistant for portfolio and project management using Amazon Bedrock
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

Planview作为连接工作管理解决方案的领先提供商,在2023年启动计划,利用亚马逊Bedrock开发AI助手Planview Copilot,以变革全球300万用户与项目管理应用的交互方式。文章探讨了开发过程中的多代理系统所面临的挑战及解决方案,包括任务路由、数据访问、API交互、新AI技能创建等,并详细介绍了其多代理架构及技术细节。

Planview致力于变革全球用户与项目管理应用的交互方式,2023年启动计划并开发AI助手Planview Copilot,利用亚马逊Bedrock的多代理系统,该系统面临多项挑战,如任务可靠路由、多源数据访问等。

为克服挑战,Planview构建了使用亚马逊Bedrock的多代理架构,其包括负责路由问题的协调器及多种类型的代理,如帮助代理、数据代理、行动代理,它们协作以提供准确全面的用户问题答案。

Planview使用关键AWS服务构建多代理架构,中央Copilot服务负责协调各项活动,包括管理用户会话聊天历史、协调流量、处理日志等,路由器和响应器是与亚马逊Bedrock交互的AWS Lambda函数。

根据不同用例和数据可用性,代理可通过多种方法与应用交互,如现有应用API、Amazon Athena等数据存储、Amazon Neptune的图数据、Amazon OpenSearch Service的文档RAG。

This post is co-written with Lee Rehwinkel from Planview.

Businesses today face numerous challenges in managing intricate projects and programs, deriving valuable insights from massive data volumes, and making timely decisions. These hurdles frequently lead to productivity bottlenecks for program managers and executives, hindering their ability to drive organizational success efficiently.

Planview, a leading provider of connected work management solutions, embarked on an ambitious plan in 2023 to revolutionize how 3 million global users interact with their project management applications. To realize this vision, Planview developed an AI assistant called Planview Copilot, using a multi-agent system powered by Amazon Bedrock.

Developing this multi-agent system posed several challenges:

To overcome these challenges, Planview developed a multi-agent architecture built using Amazon Bedrock. Amazon Bedrock is a fully managed service that provides API access to foundation models (FMs) from Amazon and other leading AI startups. This allows developers to choose the FM that is best suited for their use case. This approach is both architecturally and organizationally scalable, enabling Planview to rapidly develop and deploy new AI skills to meet the evolving needs of their customers.

This post focuses primarily on the first challenge: routing tasks and managing multiple agents in a generative AI architecture. We explore Planview’s approach to this challenge during the development of Planview Copilot, sharing insights into the design decisions that provide efficient and reliable task routing.

We describe customized home-grown agents in this post because this project was implemented before Amazon Bedrock Agents was generally available. However, Amazon Bedrock Agents is now the recommended solution for organizations looking to use AI-powered agents in their operations. Amazon Bedrock Agents can retain memory across interactions, offering more personalized and seamless user experiences. You can benefit from improved recommendations and recall of prior context where required, enjoying a more cohesive and efficient interaction with the agent. We share our learnings in our solution to help you understanding how to use AWS technology to build solutions to meet your goals.

Solution overview

Planview’s multi-agent architecture consists of multiple generative AI components collaborating as a single system. At its core, an orchestrator is responsible for routing questions to various agents, collecting the learned information, and providing users with a synthesized response. The orchestrator is managed by a central development team, and the agents are managed by each application team.

The orchestrator comprises two main components called the router and responder, which are powered by a large language model (LLM). The router uses AI to intelligently route user questions to various application agents with specialized capabilities. The agents can be categorized into three main types:

After the agents have processed the questions and provided their responses, the responder, also powered by an LLM, synthesizes the learned information and formulates a coherent response to the user. This architecture allows for a seamless collaboration between the centralized orchestrator and the specialized agents, which provides users an accurate and comprehensive answers to their questions. The following diagram illustrates the end-to-end workflow.

Technical overview

Planview used key AWS services to build its multi-agent architecture. The central Copilot service, powered by Amazon Elastic Kubernetes Service (Amazon EKS), is responsible for coordinating activities among the various services. Its responsibilities include:

The router and responder are AWS Lambda functions that interact with Amazon Bedrock. The router considers the user’s question and chat history from the central Copilot service, and the responder considers the user’s question, chat history, and responses from each agent.

Application teams manage their agents using Lambda functions that interact with Amazon Bedrock. For improved visibility, evaluation, and monitoring, Planview has adopted a centralized prompt repository service to store LLM prompts.

Agents can interact with applications using various methods depending on the use case and data availability:

The following diagram illustrates the generative AI assistant architecture on AWS.

Router and responder sample prompts

The router and responder components work together to process user queries and generate appropriate responses. The following prompts provide illustrative router and responder prompt templates. Additional prompt engineering would be required to improve reliability for a production implementation.

First, the available tools are described, including their purpose and sample questions that can be asked of each tool. The example questions help guide the natural language interactions between the orchestrator and the available agents, as represented by tools.

tools = '''<tool><toolName>applicationHelp</toolName><toolDescription>Use this tool to answer application help related questions.Example questions:How do I reset my password?How do I add a new user?How do I create a task?</toolDescription></tool><tool><toolName>dataQuery</toolName><toolDescription>Use this tool to answer questions using application data.Example questions:Which tasks are assigned to me?How many tasks are due next week?Which task is most at risk?</toolDescription></tool>

Next, the router prompt outlines the guidelines for the agent to either respond directly to user queries or request information through specific tools before formulating a response:

system_prompt_router = f'''<role>Your job is to decide if you need additional information to fully answer the User's questions.You achieve your goal by choosing either 'respond' or 'callTool'.You have access to your chat history in <chatHistory></chatHistory> tags.You also have a list of available tools to assist you in <tools></tools> tags.</role><chatHistory>{chatHistory}</chatHistory><tools>{tools}</tools><rules>- If the chat history contains sufficient information to answer the User's questions, choose the 'respond' action.- To gather more information before responding, choose the 'callTool' action.- You many only choose from the tools in the <tools></tools> tags.- If no tool can assist with the question, choose the 'respond' action.- Place your chosen action within <action></action> tags.- When you chose the 'callTool' action, provide the <toolName> and the <toolQuestion> youwould like to ask.- Your <toolQuestion> should be verbose and avoid using pronouns.- Start by providing your step-by-step thinking in <thinking></thinking> tags.- Then you will give your answer in <answer></answer> tags.- Your answer should follow the format of one of these three examples:When choosing the 'respond' action, your answer should follow the below example EXACTLY:<example><answer><action>respond</action></answer></example>When choosing the 'callTool' action for a single Tool:<example><answer><action>callTool</action><tools><tool><toolName>applicationHelp</toolName><toolQuestion>How do I reset my password?</toolQuestion></tool></tools></answer></example>Executing the above, would produce the following result:You can also call multiple Tools using this format:<example><answer><action>callTool</action><tools><tool><toolName>dataQuery</toolName><toolQuestion>How many tasks are assigned to me?</toolQuestion></tool><tool><toolName>applicationHelp</toolName><toolQuestion>How do I add a new task?</toolQuestion></tool></tools></answer></example></rules>'''

The following is a sample response from the router component that initiates the dataQuery tool to retrieve and analyze task assignments for each user:

<thinking>To determine who has the most tasks assigned, I will need to query the application data. The "dataQuery" tool seems most appropriate for this question.</thinking><answer><action>callTool</action><tools>    <tool>        <toolName>dataQuery</toolName>        <toolQuestion>Which user has the most tasks currently assigned to them?   </toolQuestion>    </tool></tools></answer>

The following is a sample response from the responder component that uses the dataQuery tool to fetch information about the user’s assigned tasks. It reports that the user has five tasks assigned to them.

<thinking>Based on the chat history, I previously called the dataQuery tool to ask "How many tasks are currently assigned to the user?". The tool responded that the user has 5 tasks assigned to them.</thinking><answer>According to the data I queried previously, you have 5 tasks assigned to you.</answer>

Model evaluation and selection

Evaluating and monitoring generative AI model performance is crucial in any AI system. Planview’s multi-agent architecture enables assessment at various component levels, providing comprehensive quality control despite the system’s complexity. Planview evaluates components at three levels:

The following figure illustrates the evaluation framework for prompts and scoring.

To conduct these evaluations, Planview uses a set of carefully crafted test questions that cover typical user queries and edge cases. These evaluations are performed during the development phase and continue in production to track the quality of responses over time. Currently, human evaluators play a crucial role in scoring responses. To aid in the evaluation, Planview has developed an internal evaluation tool to store the library of questions and track the responses over time.

To assess each component and determine the most suitable Amazon Bedrock model for a given task, Planview established the following prioritized evaluation criteria:

Based on these criteria and the current use case, Planview selected Anthropic’s Claude 3 Sonnet on Amazon Bedrock for the router and responder components.

Results and impact

Over the past year, Planview Copilot’s performance has significantly improved through the implementation of a multi-agent architecture, development of a robust evaluation framework, and adoption of the latest FMs available through Amazon Bedrock. Planview saw the following results between the first generation of Planview Copilot developed mid-2023 and the latest version:

Conclusion

In this post, we explored how Planview was able to develop a generative AI assistant to address complex work management process by adopting the following strategies:

Planview is migrating to Amazon Bedrock Agents, which enables the integration of intelligent autonomous agents within their application ecosystem. Amazon Bedrock Agents automate processes by orchestrating interactions between foundation models, data sources, applications, and user conversations.

As next steps, you can explore Planview’s AI assistant feature built on Amazon Bedrock and stay updated with new Amazon Bedrock features and releases to advance your AI journey on AWS.


About Authors

Sunil Ramachandra is a Senior Solutions Architect enabling hyper-growth Independent Software Vendors (ISVs) to innovate and accelerate on AWS. He partners with customers to build highly scalable and resilient cloud architectures. When not collaborating with customers, Sunil enjoys spending time with family, running, meditating, and watching movies on Prime Video.

Benedict Augustine is a thought leader in Generative AI and Machine Learning, serving as a Senior Specialist at AWS. He advises customer CxOs on AI strategy, to build long-term visions while delivering immediate ROI.As VP of Machine Learning, Benedict spent the last decade building seven AI-first SaaS products, now used by Fortune 100 companies, driving significant business impact. His work has earned him 5 patents.

Lee Rehwinkel is a Principal Data Scientist at Planview with 20 years of experience in incorporating AI & ML into Enterprise software. He holds advanced degrees from both Carnegie Mellon University and Columbia University. Lee spearheads Planview’s R&D efforts on AI capabilities within Planview Copilot. Outside of work, he enjoys rowing on Austin’s Lady Bird Lake.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

Planview AI助手 多代理架构 项目管理
相关文章