AWS Blogs · 18 hours ago
Introducing Amazon Application Recovery Controller Region switch: A multi-Region application recovery service

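Amazon Application Recovery Controller (ARC) Region switch is a new, fully managed service that helps organizations simplify and automate failover across AWS Regions. You create recovery plans that define the specific steps needed to switch an application's operations to another AWS Region, using execution blocks such as Amazon EC2 Auto Scaling, Aurora Global Database switchover, routing control updates, and custom Lambda functions. By coordinating and executing these tasks automatically, ARC Region switch removes the complexity of hand-maintained scripts and makes Region failover more reliable and predictable, so organizations can respond to Regional events with more confidence and maintain business continuity.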

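🌐 **Automated cross-Region recovery**: Amazon ARC Region switch provides a centralized solution for coordinating and automating recovery tasks when you need to switch an application's operations from one AWS Region to another. A recovery plan is built from configurable execution blocks, such as EC2 Auto Scaling, routing controls, database failover, manual approval, and custom Lambda function invocations, which simplifies complex cross-Region recovery and reduces reliance on manual scripts.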

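⚙️ **Flexible recovery plan construction**: Recovery plans can combine several kinds of execution blocks, run sequentially or in parallel, to coordinate the recovery of different applications or resources. The supported block types are ARC Region switch plan execution, Amazon EC2 Auto Scaling, ARC routing controls, Amazon Aurora Global Database, manual approval, AWS Lambda custom actions, Amazon Route 53 health checks, and Amazon EKS and Amazon ECS resource scaling. This lets you tailor the recovery strategy to the needs of each application.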

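📈 **Capacity management that balances cost and reliability**: Scaling execution blocks let you choose how prepared the standby resources in the destination Region should be. You set the percentage of compute capacity to target in the destination Region during recovery. For a critical application that expects a traffic surge during recovery, you can target more than 100 percent, while a lower percentage helps the overall execution finish faster. Note that a scaling block does not guarantee capacity: what is actually available depends on the destination Region's resources at the time of recovery.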
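To make the percentage setting concrete, here is a minimal sketch of the arithmetic involved. `target_capacity` is a hypothetical helper, not part of any AWS tool, and rounding fractional results up is an assumption; the announcement does not say how the service rounds.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an AWS command): estimate the capacity a scaling
# execution block would aim for in the destination Region.
# Assumption: fractional results are rounded up; the service's actual
# rounding behaviour is not documented in this article.
target_capacity() {
  local source_capacity=$1   # instances, pods, or tasks running in the source Region
  local percent=$2           # desired percentage of the source Region's capacity
  # Integer ceiling of source_capacity * percent / 100
  echo $(( (source_capacity * percent + 99) / 100 ))
}

target_capacity 40 100   # match the source Region
target_capacity 40 150   # headroom for a traffic surge during recovery
target_capacity 40 50    # smaller footprint, less to scale during execution
```

Whatever number comes out is only a target: as noted above, the destination Region may not have that capacity available at recovery time.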

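📊 **Continuous validation and regular practice**: The service automatically validates recovery plans every 30 minutes, checking resource configurations and IAM permissions. During execution it monitors the progress of each step and provides detailed logs. AWS also stresses the importance of regularly executing plans in test scenarios to verify that they work, learn the actual recovery time, and keep teams familiar with the procedure, which builds confidence in the disaster recovery strategy.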

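🔗 **Cross-account and organization-wide management**: Resources can live in an account that is separate from the account that hosts the Region switch plan. In that case, Region switch uses the `executionRole` to assume the `crossAccountRole` and access those resources. Plans can also be centralized and shared across multiple accounts with AWS Resource Access Manager (AWS RAM), so recovery plans can be managed across the organization.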

<p>As a developer advocate at AWS, I’ve worked with many enterprise organizations that operate critical applications across multiple <a href="https://docs.aws.amazon.com/glossary/latest/reference/glos-chap.html#region">AWS Regions</a>. A key concern they often share is a lack of confidence in their Region failover strategy: whether it will work when needed, whether all dependencies have been identified, and whether their teams have practiced the procedures enough. Traditional approaches often leave them uncertain about their readiness for a Region switch.</p><p>Today, I’m excited to announce <a href="https://aws.amazon.com/application-recovery-controller/">Amazon Application Recovery Controller (ARC)</a> Region switch, a fully managed, highly available capability that enables organizations to plan, practice, and orchestrate Region switches with confidence, eliminating the uncertainty around cross-Region recovery operations. Region switch helps you orchestrate recovery for your multi-Region applications on AWS. It gives you a centralized solution to coordinate and automate recovery tasks across AWS services and accounts when you need to switch your application’s operations from one AWS Region to another.</p><p>Many customers deploy business-critical applications across multiple AWS Regions to meet their availability requirements. When an operational event impacts an application in one Region, switching operations to another Region involves coordinating multiple steps across different AWS services, such as compute, databases, and DNS. This coordination typically requires building and maintaining complex scripts that need regular testing and updates as applications evolve.
Additionally, orchestrating and tracking the progress of Region switches across multiple applications and providing evidence of successful recovery for compliance purposes often involves manual data gathering.</p><p>Region switch is built on a Regional data plane architecture, where Region switch plans are executed from the Region being activated. This design eliminates dependencies on the impacted Region during the switch, providing a more resilient recovery process because the execution is independent of the Region you’re switching from.</p><p><strong>Building a recovery plan with ARC Region switch<br /></strong> With ARC Region switch, you can create recovery plans that define the specific steps needed to switch your application between Regions. Each plan contains execution blocks that represent actions on AWS resources. At launch, Region switch supports nine types of execution blocks:</p><ul><li>ARC Region switch plan execution block–Lets you orchestrate the order in which multiple applications switch to the Region you want to activate by referencing other Region switch plans.</li><li><a href="https://aws.amazon.com/ec2/autoscaling/">Amazon EC2 Auto Scaling</a> execution block–Scales Amazon EC2 compute resources in your target Region by matching a specified percentage of your source Region’s capacity.</li><li>ARC <a href="https://docs.aws.amazon.com/r53recovery/latest/dg/routing-control.html">routing controls</a> execution block–Changes routing control states to redirect traffic using DNS health checks.</li><li><a href="https://aws.amazon.com/rds/aurora/">Amazon Aurora</a> global database execution block–Performs database failover with potential data loss or switchover with zero data loss for <a href="https://aws.amazon.com/rds/aurora/global-database/">Aurora Global Database</a>.</li><li>Manual approval execution block–Adds approval checkpoints in your recovery workflow where team members can review and approve before
proceeding.</li><li>Custom Action <a href="https://aws.amazon.com/lambda/">AWS Lambda</a> execution block–Adds custom recovery steps by executing Lambda functions in either the activating or deactivating Region.</li><li><a href="https://aws.amazon.com/route53/">Amazon Route 53</a> health check execution block–Lets you specify which Regions your application’s traffic will be redirected to during failover. When your Region switch plan executes, the Amazon Route 53 health check state is updated and traffic is redirected based on your DNS configuration.</li><li><a href="https://aws.amazon.com/eks/">Amazon Elastic Kubernetes Service (Amazon EKS)</a> resource scaling execution block–Scales Kubernetes pods in your target Region during recovery by matching a specified percentage of your source Region’s capacity.</li><li><a href="https://aws.amazon.com/ecs/">Amazon Elastic Container Service (Amazon ECS)</a> resource scaling execution block–Scales ECS tasks in your target Region by matching a specified percentage of your source Region’s capacity.</li></ul><p>Region switch continually validates your plans by checking resource configurations and <a href="https://aws.amazon.com/iam/">AWS Identity and Access Management (IAM)</a> permissions every 30 minutes. During execution, Region switch monitors the progress of each step and provides detailed logs. You can view execution status through the Region switch dashboard and at the bottom of the execution details page.</p><p>To help you balance cost and reliability, Region switch offers flexibility in how you prepare your standby resources. You can configure the desired percentage of compute capacity to target in your destination Region during recovery using Region switch scaling execution blocks.
For critical applications expecting surge traffic during recovery, you might choose to scale beyond 100 percent capacity, while setting a lower percentage can help achieve faster overall execution times. However, it’s important to note that using one of the scaling execution blocks does not guarantee capacity, and actual resource availability depends on the capacity in the destination Region at the time of recovery. To facilitate the best possible outcomes, we recommend regularly testing your recovery plans and maintaining appropriate <a href="https://docs.aws.amazon.com/servicequotas/latest/userguide/intro.html">Service Quotas</a> in your standby Regions.</p><p>ARC Region switch includes a global dashboard you can use to monitor the status of Region switch plans across your enterprise and Regions. Additionally, there’s a Regional executions dashboard that only displays executions within the current console Region. This dashboard is designed to be highly available across each Region so it can be used during operational events.</p><p>Region switch allows resources to be hosted in an account that is separate from the account that contains the Region switch plan. If the plan uses resources from an account that is different from the account that hosts the plan, then Region switch uses the <code>executionRole</code> to assume the <code>crossAccountRole</code> to access those resources. Additionally, Region switch plans can be centralized and shared across multiple accounts using <a href="https://docs.aws.amazon.com/ram/latest/userguide/what-is.html">AWS Resource Access Manager (AWS RAM)</a>, enabling efficient management of recovery plans across your organization.</p><p><strong>Let’s see how it works<br /></strong> Let me show you how to create and execute a Region switch plan. There are three parts in this demo. First, I create a Region switch plan. Then, I define a workflow.
Finally, I configure the triggers.</p><p><strong>Step 1: Create a plan</strong></p><p>I navigate to the Application Recovery Controller section of the <a href="https://console.aws.amazon.com">AWS Management Console</a>. I choose <strong>Region switch</strong> in the left navigation menu. Then, I choose <strong>Create Region switch plan</strong>.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_10-15-47.png"><img class="aligncenter size-full wp-image-98498" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_10-15-47.png" alt="ARC Region switch - 1" width="1600" height="986" /></a></p><p>After I name my plan, I specify a <strong>Multi-Region recovery approach</strong> (active/passive or active/active). In active/passive mode, two application replicas are deployed into two Regions, with traffic routed into the active Region only. The replica in the passive Region can be activated by executing the Region switch plan.</p><p>Then, I select the <strong>Primary Region</strong> and <strong>Standby Region</strong>. Optionally, I can enter a <strong>Desired recovery time objective (RTO)</strong>. The service will use this value to provide insight into how long Region switch plan executions take in relation to my desired RTO.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_10-17-29.png"><img class="aligncenter size-full wp-image-98497" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_10-17-29.png" alt="ARC Region switch - create plan" width="1600" height="1404" /></a></p><p>I enter the <strong>Plan execution IAM role</strong>. This is the role that allows Region switch to call AWS services during execution.
I make sure the role I choose can be assumed by the service and contains the minimum set of permissions allowing ARC to operate. Refer to the <a href="https://docs.aws.amazon.com/r53recovery/latest/dg/security_iam_service-with-iam.html">IAM permissions section of the documentation</a> for the details.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_10-18-09.png"><img class="aligncenter size-full wp-image-98496" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_10-18-09.png" alt="ARC Region switch - create plan 2" width="1600" height="888" /></a></p><p><strong>Step 2: Create a workflow</strong></p><p>When the two <strong>Plan evaluation status</strong> notifications are green, I create a workflow. I choose <strong>Build workflows</strong> to get started.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_10-18-32.png"><br /><img class="aligncenter size-full wp-image-98495" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_10-18-32.png" alt="ARC Region switch - status" width="1600" height="860" /></a></p><p>Plans enable you to build specific workflows that will recover your applications using Region switch execution blocks. You can build workflows with execution blocks that run sequentially or in parallel to orchestrate the order in which multiple applications or resources recover into the activating Region. A plan is made up of these workflows that allow you to activate or deactivate a specific Region.</p><p>For this demo, I use the graphical editor to create the workflow, but you can also define the workflow in JSON.
This format is better suited for automation or when you want to store your workflow definition in a source code management system (SCMS) and use it with your infrastructure as code (IaC) tools, such as <a href="https://aws.amazon.com/cloudformation/">AWS CloudFormation</a>.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_10-49-22.png"><img class="aligncenter size-full wp-image-98502" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_10-49-22.png" alt="ARC - define workflows" width="1600" height="1038" /></a></p><p>I can alternate between the <strong>Design</strong> and the <strong>Code</strong> views by selecting the corresponding tab next to the <strong>Workflow builder</strong> title. The JSON view is read-only. I designed the workflow with the graphical editor and copied the JSON equivalent to store it alongside my IaC project files.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_10-49-35.png"><img class="aligncenter size-full wp-image-98501" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_10-49-35.png" alt="ARC - define workflows as code" width="1600" height="1008" /></a></p><p>Region switch launches an evaluation to validate your recovery strategy every 30 minutes. It regularly checks that all actions defined in your workflows will succeed when executed. This proactive validation assesses various elements, including IAM permissions and resource states across accounts and Regions.
By continually monitoring these dependencies, Region switch helps ensure your recovery plans remain viable and identifies potential issues before they impact your actual switch operations.</p><p>However, just as an untested backup is not a reliable backup, an untested recovery plan cannot be considered truly validated. While continuous evaluation provides a strong foundation, we strongly recommend regularly executing your plans in test scenarios to verify their effectiveness, understand actual recovery times, and ensure your teams are familiar with the recovery procedures. This hands-on testing is essential for maintaining confidence in your disaster recovery strategy.</p><p><strong>Step 3: Create a trigger</strong></p><p>A trigger defines the conditions that activate the workflows just created. It’s expressed as a set of CloudWatch alarms. Alarm-based triggers are optional. You can also use Region switch with manual triggers.</p><p>From the Region switch page in the console, I choose the <strong>Triggers</strong> tab and choose <strong>Add triggers</strong>.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_11-12-54.png"><img class="aligncenter size-full wp-image-98504" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_11-12-54.png" alt="ARC - Trigger" width="1600" height="862" /></a></p><p>For each Region defined in my plan, I choose <strong>Add trigger</strong> to define the triggers that will activate the Region.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_11-13-21.png"><img class="aligncenter size-full wp-image-98505" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_11-13-21.png" alt="ARC - Trigger 2" width="1600" height="674" /></a></p><p>Finally, I choose the alarms and their state (OK or Alarm)
that Region switch will use to trigger the activation of the Region.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_11-15-20.png"><img class="aligncenter size-full wp-image-98506" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/17/2025-07-17_11-15-20.png" alt="ARC - Trigger 3" width="1572" height="1432" /></a></p><p>I’m now ready to test the execution of the plan to switch Regions using Region switch. It’s important to execute the plan from the Region I’m activating (the target Region of the workflow) and use the data plane in that specific Region.</p><p>Here is how to execute a plan using the <a href="https://aws.amazon.com/cli/">AWS Command Line Interface (AWS CLI)</a>:</p><pre class="lang-bash">aws arc-region-switch start-plan-execution --plan-arn arn:aws:arc-region-switch::111122223333:plan/resource-id --target-region us-west-2 --action activate</pre><p><strong>Pricing and availability<br /></strong> Region switch is available in all commercial AWS Regions at $70 per month per plan. Each plan can include up to 100 execution blocks, or you can create parent plans to orchestrate up to 25 child plans.</p><p>Having seen firsthand the engineering effort that goes into building and maintaining multi-Region recovery solutions, I’m thrilled to see how Region switch will help automate this process for our customers. To get started with ARC Region switch, <a href="https://console.aws.amazon.com/route53recovery/home">visit the ARC console and create your first Region switch plan</a>. For more information about Region switch, visit the <a href="https://docs.aws.amazon.com/amazonarc/">Amazon Application Recovery Controller (ARC) documentation</a>.
You can also reach out to your AWS account team with questions about using Region switch for your multi-Region applications.</p><p>I look forward to hearing about how you use Region switch to strengthen your multi-Region applications’ resilience.</p><a href="https://linktr.ee/sebsto">– seb</a>
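
The `start-plan-execution` command from the walkthrough can be wrapped in a small dry-run helper for rehearsals. This is a sketch, not an official tool: `build_region_switch_cmd` is a hypothetical name, it only prints the command rather than running it, and two things are assumptions. Pinning the AWS CLI's global `--region` option to the target Region is one way to follow the advice about executing from the Region being activated, and accepting `deactivate` alongside `activate` is inferred from the article's mention of workflows that activate or deactivate a Region.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints the start-plan-execution command
# instead of executing it, so it can be reviewed before a real switch.
# --plan-arn, --target-region and --action come from the article's example.
# Assumptions: --region (a global AWS CLI option) is pinned to the target
# Region, and "deactivate" is accepted as an action next to "activate".
build_region_switch_cmd() {
  local plan_arn=$1 target_region=$2 action=$3
  case "$action" in
    activate|deactivate) ;;
    *) echo "unsupported action: $action" >&2; return 1 ;;
  esac
  printf 'aws arc-region-switch start-plan-execution --region %s --plan-arn %s --target-region %s --action %s\n' \
    "$target_region" "$plan_arn" "$target_region" "$action"
}

# Placeholder ARN copied from the article's example.
build_region_switch_cmd "arn:aws:arc-region-switch::111122223333:plan/resource-id" us-west-2 activate
```

Printing the command first keeps a rehearsal from starting an execution by accident; once the output looks right, it can be run as-is.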
