AWS Machine Learning Blog, September 12, 2024
Generative AI-powered technology operations

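This article explores how AWS generative AI solutions (including Amazon Bedrock, Amazon Q Developer, and Amazon Q Business) can improve TechOps efficiency, reduce time to resolve issues, improve customer experience, standardize operating procedures, and strengthen knowledge bases. Generative AI can interpret complex situations and address challenges that traditional AI/ML approaches cannot handle.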

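**Event management:** Generative AI can monitor systems and analyze performance data to predict issues and, when an incident occurs, generate preliminary documentation covering impacted systems, potential root causes, and troubleshooting steps. It can also generate summary reports of past incidents to help teams identify recurring problems and opportunities for preventive measures. In addition, it can format inbound maintenance notifications from different service providers into a standard format, which speeds up understanding the impact of upcoming maintenance. Similarly, if an anomaly is detected, generative AI can automatically generate outbound cases to service providers.

**Knowledge base management:** Generative AI can help engineers automatically create operational documents such as standard operating procedures (SOPs) and supplemental documents covering, for example, server hardening, security policies for external IP allow lists, and operating system patching. Trained on large datasets of existing SOPs and similar content, generative AI systems can understand the structure and language commonly used in these documents. Engineers can then provide high-level requirements or parameters for a new procedure, and generative AI can automatically generate a properly formatted draft with the appropriate sections, level of detail, and terminology.

**Automation:** Generative AI can assist engineers and automate certain tasks that would otherwise require manual work. It can help generate script code for repetitive automation processes. Trained on existing code examples, generative models can learn patterns and syntax. Engineers can provide a high-level description or specification of what needs to be automated, such as "Generate a script to back up and archive files older than 30 days in this directory." The AI model can then produce working code for the task based on its training.

**Customer experience:** Generative AI can analyze large volumes of customer service data, such as call logs and support tickets, and identify patterns in the issues customers frequently report. This insight lets operations teams proactively address common problems before they severely impact customers. Generative AI assistants can also automate many routine service tasks, freeing human agents to focus on more complex issues that require personalization.

**Staff productivity:** An around-the-clock infrastructure operations team faces challenges in maintaining staff productivity during off-hours and nights, when the volume of support requests is lower. A generative AI assistant can help improve staff productivity in these periods and streamline the handover process. The assistant can be trained on historical support conversations so that it can understand and resolve most routine queries independently. It can communicate with customers through messaging platforms to provide instant assistance. Simple requests that the assistant handles free the team to focus on complex issues that require human expertise. The AI system can escalate any query it cannot resolve on its own to on-call staff. This lets night and weekend staff work with fewer interruptions.

**Reporting:** Generative AI has the potential to help infrastructure operations teams streamline reporting. Using ML algorithms trained on past report examples, a generative AI system can automatically generate draft reports from incoming data from monitoring systems and other operational tools. This can save teams significant time otherwise spent compiling information into standardized report formats. AI-generated reports can include summary data visualizations, descriptive analyses, and recommendations tailored to each recipient.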

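**Cost optimization:** Generative AI can help optimize IT costs by reducing downtime through predictive maintenance, automating tasks to reduce manual effort, and identifying potential waste and inefficiency through analysis of customer service data.

**Security enhancement:** Generative AI can help improve IT security by detecting threats through identification of suspicious activity and anomalous behavior, automating security tasks such as vulnerability scanning and patch management, and generating security policies and procedures.

**Driving innovation:** Generative AI can help drive IT innovation by automating tasks, freeing the team's time and energy for more strategic projects such as developing new products and services and exploring new technologies and trends.

**Data analysis and insights:** Generative AI can help teams analyze large volumes of data, identify trends and patterns, gain deeper insights, and make better-informed decisions.

**Continuous improvement:** Generative AI can help teams continuously improve their processes and operations by collecting feedback, identifying areas for improvement, and automating optimization tasks.

**User experience:** Generative AI can help improve user experience by providing personalized service, resolving issues faster, and offering more effective support.

**Competitive advantage:** Generative AI can help teams gain a competitive advantage by improving efficiency, lowering costs, improving customer experience, and driving innovation.

**Future outlook:** Generative AI has great potential in TechOps and is expected to play a role in more areas, such as automating more complex tasks, providing smarter analysis and insights, and enabling more innovative solutions.

**Ethics and risk:** Using generative AI requires attention to ethical and risk considerations, such as data privacy and security, algorithmic bias and transparency, and the impact on jobs.

**Best practices:** Implementing generative AI solutions calls for following best practices, such as choosing suitable tools and technologies, setting clear goals and metrics, ensuring data quality and security, and regularly evaluating and improving the solution.

**Continuous learning and development:** Generative AI technology keeps evolving, so teams need to keep learning in order to stay current with the latest technologies and trends and apply them to improve TechOps efficiency and effectiveness.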

Technology operations (TechOps) refers to the set of processes and activities involved in managing and maintaining an organization’s IT infrastructure and services. Several terms are used for managing information technology operations, including ITOps, SRE, AIOps, DevOps, and SysOps. In this post, we refer to all of them as TechOps. This includes tasks such as managing servers, networks, databases, and applications to maintain the reliability, performance, and security of IT systems. However, certain tasks require manual and repetitive effort, such as incident detection and response, analyzing incoming tickets from disparate service providers, finding standard operating procedures for known and unknown issues, and managing support case resolution. In recent years, TechOps has been using AI capabilities (called AIOps) for operational data collection, aggregation, and correlation to generate actionable insights, identify root causes, and more.

This post describes how AWS generative AI solutions (including Amazon Bedrock, Amazon Q Developer, and Amazon Q Business) can further enhance TechOps productivity, reduce time to resolve issues, enhance customer experience, standardize operating procedures, and augment knowledge bases. Because generative AI can interpret complex situations on a nuanced, case-by-case basis, it can solve challenges that other approaches, including traditional artificial intelligence and machine learning (AI/ML)-based pattern matching, can’t handle. The following table shows a few examples of how AWS generative AI services can help with day-to-day TechOps activities.

| Amazon Bedrock | Amazon Q Developer | Amazon Q Business |
| --- | --- | --- |
| Root cause analysis | Maintenance tasks code generation | Standard operating procedure |
| Knowledge base creation | Increase productivity and efficiency | Organization policy and procedure |
| Recurring reporting | | Customer experience and sentiment analysis |
| Outbound support case generation | | Shift handover chatbot |
| Inbound maintenance notifications formatting | | |

A typical day in the life of a TechOps team includes issue resolution, root cause analysis, maintenance activities, and updating knowledge bases to provide a positive customer experience. In the following sections, we discuss some of these areas and how generative AI can help enhance TechOps.

Event management

By monitoring systems and analyzing patterns in performance data, an AI model can predict issues before they cause outages or degraded service. When incidents do occur, generative AI models can generate preliminary documentation of the event, including details on impacted systems, potential root causes, and troubleshooting steps. This allows engineers to quickly get up to speed on new incidents and accelerate response efforts.

Generative AI can also generate summary reports of past incidents to help teams identify recurring problems and opportunities for preventative measures. Furthermore, it can help with formatting inbound maintenance notifications from various service providers into a standard format, which can speed up understanding the impact of upcoming maintenance. Similarly, generative AI can automatically generate outbound cases to service providers if it detects an anomaly.
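As a rough illustration of the notification-formatting idea, the following Python sketch builds a prompt that asks a model to restate a free-form maintenance notice as JSON and then validates the reply. The field names, the `MaintenanceNotice` type, and the `invoke_model` callable are assumptions made for this example. In practice `invoke_model` would wrap a call to a model hosted on a service such as Amazon Bedrock, which is not shown here.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical standard format; the field names are illustrative assumptions.
FIELDS = ("provider", "service", "start_utc", "end_utc", "impact")


@dataclass
class MaintenanceNotice:
    provider: Optional[str]
    service: Optional[str]
    start_utc: Optional[str]
    end_utc: Optional[str]
    impact: Optional[str]


def build_prompt(raw_notice: str) -> str:
    """Ask the model to restate a free-form notice as JSON with fixed keys."""
    keys = ", ".join(FIELDS)
    return (
        "Extract the maintenance details from the notice below.\n"
        f"Reply with a single JSON object with exactly these keys: {keys}.\n"
        "Use null for anything the notice does not state.\n\n"
        f"Notice:\n{raw_notice}"
    )


def normalize_notice(
    raw_notice: str, invoke_model: Callable[[str], str]
) -> MaintenanceNotice:
    """Send the prompt through `invoke_model` and validate the JSON reply."""
    reply = json.loads(invoke_model(build_prompt(raw_notice)))
    missing = [key for key in FIELDS if key not in reply]
    if missing:
        raise ValueError(f"model reply is missing keys: {missing}")
    return MaintenanceNotice(**{key: reply[key] for key in FIELDS})
```

Rejecting replies that lack a required key keeps malformed model output from silently entering the ticketing workflow. Asking for `null` instead of a guess gives reviewers a clear signal about which details the provider never stated.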

By taking over basic documentation and prediction tasks, generative AI can help infrastructure teams spend less time on repetitive work and more time resolving issues to improve overall system reliability.

To learn more about using Amazon Bedrock for summary tasks, refer to Create summaries of recordings using generative AI with Amazon Bedrock and Amazon Transcribe. To learn how Wiz uses Amazon Bedrock to address security risks, see How Wiz is empowering organizations to remediate security risks faster with Amazon Bedrock. To learn how HappyFox uses Anthropic Claude in Amazon Bedrock, refer to HappyFox Automates Support Agent Responses with Claude in Amazon Bedrock, Increasing Ticket Resolution by 40%.

Knowledge base management

Generative AI has the potential to help engineers automatically create operational documents such as standard operating procedures (SOPs) and supplemental documents covering server hardening, security policies for external IP allow lists, operating system patching, and more.

Using natural language models trained on large datasets of existing SOPs and similar content, generative AI systems can understand the common structure and language used in these types of documents. Engineers can then provide the system with high-level requirements or parameters for a new procedure, and generative AI can automatically generate a draft document formatted with the appropriate sections, level of detail, and terminology. This allows engineers to spend less time on documentation and more time focused on other engineering tasks. The initial drafts from AI also provide a strong starting point that engineers can refine.
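To make the drafting workflow more concrete, here is a minimal Python sketch that turns high-level requirements into a drafting prompt and checks a returned draft for the expected sections. The section list and function names are hypothetical, and the model call itself is omitted. An organization would substitute its own SOP template.

```python
from typing import List, Sequence

# Hypothetical section list; substitute the organization's own SOP template.
REQUIRED_SECTIONS = (
    "Purpose", "Scope", "Prerequisites", "Procedure", "Rollback", "Verification",
)


def build_sop_prompt(title: str, requirements: Sequence[str]) -> str:
    """Turn high-level requirements into a drafting prompt with fixed headings."""
    bullet_list = "\n".join(f"- {item}" for item in requirements)
    headings = "\n".join(REQUIRED_SECTIONS)
    return (
        f"Draft a standard operating procedure titled '{title}'.\n"
        f"Use exactly these section headings, each on its own line:\n{headings}\n\n"
        f"Requirements:\n{bullet_list}"
    )


def missing_sections(draft: str) -> List[str]:
    """List required headings that do not appear on a line of their own."""
    lines = {line.strip().rstrip(":").lower() for line in draft.splitlines()}
    return [name for name in REQUIRED_SECTIONS if name.lower() not in lines]
```

A check like `missing_sections` is only a structural gate. It can flag an incomplete draft before an engineer spends time on it, but it does not replace the engineer's review of the content.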

Overall, generative AI offers a more efficient way for engineers to develop standardized procedural content at scale.

To learn how to use Amazon Bedrock to generate product descriptions, see Automating product description generation with Amazon Bedrock. Additionally, refer to How Skyflow creates technical content in days using Amazon Bedrock to learn how Skyflow Inc.—a data privacy company—uses Amazon Bedrock to streamline the creation of technical content, reducing the process from weeks to days while maintaining the highest standards of data privacy and security.

Automation

Generative AI can assist engineers and automate certain tasks that would otherwise require manual work. One area where this could help is script generation for repetitive automation processes. By training AI models on large datasets of existing code examples for common programming tasks like file operations or system configuration, generative models can learn patterns and syntax.

An Amazon Q customization is a set of elements that enables Amazon Q to provide you with suggestions based on your company’s code base. Engineers can then provide high-level descriptions or specifications of what they need automated, such as “Generate a script to back up and archive files older than 30 days in this directory.” The AI model would be able to produce working code to accomplish this automatically based on its training. This would save engineers considerable time writing and testing scripts for routine jobs, allowing them to focus on more creative and challenging aspects of their work. As generative AI techniques advance, more complex engineering automation may also be achieved.
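For a sense of what the backup prompt quoted above might yield, the following is a hand-written Python sketch of such a script, not actual assistant output. The function name and parameters are assumptions. It handles only regular files directly inside the directory (no recursion) and leaves the originals in place.

```python
import tarfile
import time
from pathlib import Path
from typing import List, Optional


def archive_old_files(
    directory: str,
    archive_path: str,
    max_age_days: int = 30,
    now: Optional[float] = None,
) -> List[str]:
    """Archive files last modified more than `max_age_days` ago.

    Writes a gzip-compressed tar archive at `archive_path` and returns the
    names of the archived files. Originals are not deleted.
    """
    cutoff = (time.time() if now is None else now) - max_age_days * 86400
    # Collect candidates before the archive is created, so an archive written
    # into the same directory is never picked up as input.
    old_files = sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and path.stat().st_mtime < cutoff
    )
    if not old_files:
        return []
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in old_files:
            archive.add(path, arcname=path.name)
    return [path.name for path in old_files]
```

Generated scripts of this kind still warrant review and a test run against a scratch directory before they are scheduled, particularly if a later version is extended to delete the originals.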

Refer to Upgrade your Java applications with Amazon Q Code Transformation to learn about the Amazon Q Code Transformation feature. Also, refer to Using Amazon Bedrock Agents to interactively generate infrastructure as code to learn how to configure Amazon Bedrock Agents to generate infrastructure as code. Lastly, refer to TymeX Accelerates Clean Coding by 40% by Implementing Generative AI on AWS to learn how TymeX uses generative AI on AWS.

Customer experience

Generative AI can analyze large volumes of customer service data, like call logs and support tickets, and identify patterns in issues customers frequently report. This insight allows operations teams to proactively address common problems before they severely impact customers. Generative AI assistants can also automate many routine service tasks, freeing up human agents to focus on more complex inquiries that require personalization. With AI assistance, infrastructure services can be restored more quickly when outages occur. This helps make sure operations are more efficient and transparent, directly enhancing the experience for the customers that infrastructure teams aim to support.

Amazon Q Business offers a conversational experience with generative prompts and tasks that can act as a front-line support engineer, answering customer questions and resolving known issues efficiently. The feature can use data from enterprise systems to provide accurate and timely responses, reducing the burden on human engineers and improving customer satisfaction.

With Amazon Bedrock, you can perform sentiment analysis to help analyze customer emotions and provide context to human engineers, enabling them to provide better support and improve customer loyalty, retention, and growth.

Refer to Develop advanced generative AI chat-based assistants by using RAG and ReAct prompting to learn one way to develop generative AI assistants. Refer to Building a Generative AI Contact Center Solution for DoorDash Using Amazon Bedrock, Amazon Connect, and Anthropic’s Claude to learn how DoorDash built a generative AI contact center solution using AWS services. To learn how PGA TOUR built a generative AI virtual assistant, see The journey of PGA TOUR’s generative AI virtual assistant, from concept to development to prototype.

Staff productivity

A 24/7 infrastructure operations team faces challenges in maintaining staff productivity during off-hours and nights when the volume of support requests is lower. A generative AI assistant can help improve staff productivity in these periods and streamline the shift-handover process.

The assistant can be trained on historical support conversations to understand and resolve a large percentage of routine queries independently. It can communicate with customers on messaging platforms to provide instant assistance. Simple requests that the assistant can address free up the team to focus on complex issues requiring human expertise. The AI system can escalate any queries it can’t resolve on its own to the on-call staff. This allows the night and weekend crew to work with fewer interruptions. They can work through tasks more efficiently knowing the assistant is handling basic support needs independently. Generative AI-powered contact center solutions can improve an agent’s ability to interact with customers more precisely and speed up issue resolution, increasing overall productivity.
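The escalation behavior described above can be sketched as a simple routing rule. In this Python example, `answer_fn` stands in for the assistant and is assumed to return an answer together with a confidence score between 0 and 1. Both that interface and the 0.8 threshold are illustrative assumptions, not features of a specific AWS service.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Routing:
    handled_by: str  # "assistant" or "on_call"
    reply: str


def route_query(
    query: str,
    answer_fn: Callable[[str], Tuple[str, float]],
    min_confidence: float = 0.8,
) -> Routing:
    """Let the assistant answer when it is confident; otherwise escalate."""
    answer, confidence = answer_fn(query)
    if confidence >= min_confidence:
        return Routing("assistant", answer)
    # Below the threshold the draft answer is discarded and a person takes over.
    return Routing("on_call", f"Escalated to on-call staff: {query}")
```

The threshold is the main tuning knob. A higher value sends more queries to the on-call engineer, while a lower value reduces interruptions at the cost of more unreviewed answers.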

To learn how to automate document and data retrieval for AI assistants, see Automate chatbot for document and data retrieval using Amazon Bedrock Agents and Knowledge Bases. Refer to How LeadSquared accelerated chatbot deployments with generative AI using Amazon Bedrock and Amazon Aurora PostgreSQL to learn how LeadSquared uses Amazon Bedrock and Amazon Aurora PostgreSQL-Compatible Edition to deploy generative AI-powered assistants on their Converse platform, which personalize interactions based on customer-specific training data. This integration reduces customer onboarding costs, minimizes manual effort, and improves chatbot responses, transforming customer support and engagement by providing swift and relevant assistance.

Reporting

Generative AI has the potential to help infrastructure operations teams streamline reporting processes. By using ML algorithms trained on past report examples, a generative AI system can automatically generate draft reports based on incoming data from monitoring systems and other operational tools. This can save teams significant time spent compiling information into standardized report formats. The AI-generated reports could include summary data visualizations, descriptive analyses, and recommendations tailored to each recipient.
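One way to picture the data-to-draft step is to condense monitoring samples into a compact summary and embed it in a drafting prompt. The Python sketch below does this with hypothetical metric names and function names. It assumes each metric has at least one sample and leaves out the model call.

```python
from statistics import mean
from typing import Dict, List


def summarize_metrics(samples: Dict[str, List[float]]) -> str:
    """Condense raw samples into one min/avg/max line per metric."""
    rows = []
    for name in sorted(samples):
        values = samples[name]  # assumed non-empty
        rows.append(
            f"{name}: min={min(values):.1f} "
            f"avg={mean(values):.1f} max={max(values):.1f}"
        )
    return "\n".join(rows)


def build_report_prompt(
    period: str, audience: str, samples: Dict[str, List[float]]
) -> str:
    """Assemble a drafting prompt that is grounded in the summarized figures."""
    return (
        f"Write a draft operations report for {audience} covering {period}.\n"
        "Base every statement only on the figures below and flag anything unusual.\n\n"
        f"{summarize_metrics(samples)}"
    )
```

Computing the figures in code and passing them to the model, instead of asking the model to do the arithmetic, keeps the numbers in the draft traceable to the monitoring data.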

Teams would still need to review the drafts for accuracy before finalizing and distributing them. However, having an initial version generated automatically could cut down on routine reporting tasks so engineers have more time for higher-value problem-solving and strategic planning work. The use of AI could help infrastructure teams meet their reporting obligations more efficiently.

Amazon Q in QuickSight is your generative AI assistant that makes it straightforward to build and consume insights. For more information, see Amazon Q is now generally available in Amazon QuickSight, bringing Generative BI capabilities to the entire organization. Also, refer to Anthology uses embedded analytics offered by Amazon QuickSight to democratize decision making for higher education to learn how Anthology is using Amazon Q in QuickSight to offer institutions self-serve options for analytics needs that aren’t directly addressed by the central dashboards.

You can explore more customer stories and case studies at Generative AI Customer Stories to learn how customers are using AWS generative AI services. Refer to Derive meaningful and actionable operational insights from AWS Using Amazon Q Business to learn how to use AWS generative AI services, like Amazon Q Business, with AWS Support cases, AWS Trusted Advisor, and AWS Health data to derive actionable insights based on common patterns, issues, and resolutions while using the AWS recommendations and best practices enabled by support data.

Conclusion

Integrating generative AI into TechOps represents a transformative leap in the management and optimization of IT infrastructure and services. By using AWS generative AI solutions such as Amazon Bedrock, Amazon Q Developer, and Amazon Q Business, organizations can significantly enhance productivity, reduce the time required to resolve issues, and improve overall customer experience. Generative AI’s sophisticated capabilities in predicting and preventing outages, automating documentation, and generating actionable insights from operational data position it as a critical tool for modern TechOps teams.

You can unlock unimagined possibilities with generative AI by using the AWS Generative AI Innovation Center program, which pairs you with AWS science and strategy experts with deep experience in AI/ML and generative AI techniques. To get started, contact your AWS Account Manager. If you don’t have an AWS Account Manager, contact AWS Sales.


About the Authors

Raman Pujani is a Solutions Architect at Amazon Web Services, where he helps customers to accelerate their business transformation journey with AWS. He builds simplified and sustainable solutions for complex business problems with innovative technology. Raman has 25+ years of experience in IT Transformation. Besides work, he enjoys spending time with family, vacationing in the mountains, and music.

Rachanee Singprasong is a Principal Customer Solutions Manager in Strategic Accounts at Amazon Web Services. Her role is focused on enabling customers in their cloud adoption and digital transformation journey. She has a Ph.D. in Operations Research and her passion is to solve complex customer challenges using creative solutions.

Vijay Sivaji is a Senior Technical Account Manager in Strategic Accounts at Amazon Web Services. He helps customers solve architectural, operational, and cost optimization challenges. In his spare time, he enjoys playing tennis.
