Unknown source, October 2, 2024
Ensure availability of your data using cross-cluster replication with Amazon OpenSearch Service

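Amazon OpenSearch Service is a fully managed service for deploying and operating OpenSearch and legacy Elasticsearch clusters cost-effectively and at scale in the AWS Cloud. It offers the latest versions of OpenSearch, supports 19 versions of Elasticsearch, includes visualization capabilities, and has announced support for cross-cluster replication. This post covers that feature and the steps to set it up.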

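🎯 Amazon OpenSearch Service is a fully managed service for deploying and operating OpenSearch and legacy Elasticsearch clusters in the AWS Cloud, cost-effectively and at scale. It offers the latest versions of OpenSearch, supports multiple Elasticsearch versions, and provides visualization capabilities.

🚀 OpenSearch Service announced support for cross-cluster replication on October 5, 2021. It replicates indices at low latency between domains in the same or different AWS Regions without additional technologies. It provides sequential consistency while continuously copying data from the leader index to the follower index, with typical delivery times under a minute, and the replication status can be monitored continuously via APIs.

💪 Cross-cluster replication helps with data proximity, disaster recovery, and multi-cluster patterns. For example, data can be replicated from one Region to multiple Regions across the globe, and in a disaster recovery scenario one or more follower clusters can be set up in the same or different Regions. It supports active/active read and active/passive write.

📋 Setting up cross-cluster replication involves several steps: create two clusters across Regions, create an outbound connection request on the follower domain, approve the inbound connection on the leader domain, edit the security configuration, create a leader index, and start replication from the follower domain. The post also covers pausing and stopping replication, auto-follow, monitoring replication metrics, and recovering from failure.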

<section class="blog-post-content"><p><a href="https://aws.amazon.com/opensearch-service/" target="_blank" rel="noopener noreferrer">Amazon OpenSearch Service</a> is a fully managed service that you can use to deploy and operate OpenSearch and legacy Elasticsearch clusters, cost-effectively, at scale in the AWS Cloud. The service makes it easy for you to perform interactive log analytics, real-time application monitoring, website search, and more by offering the latest versions of OpenSearch, support for 19 versions of Elasticsearch (1.5 to 7.10 versions), and visualization capabilities powered by OpenSearch Dashboards and Kibana (1.5 to 7.10 versions).</p><p>OpenSearch Service announced support for <a href="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/replication.html" target="_blank" rel="noopener noreferrer">cross-cluster replication</a> on October 5, 2021. With cross-cluster replication for OpenSearch Service, you can replicate indices at low latency from one domain to another in the same or different AWS Regions without needing additional technologies. Cross-cluster replication provides sequential consistency while continuously copying data from the leader index to the follower index. Sequential consistency ensures the leader and the follower return the same result set after operations are applied on the indices in the same order. Cross-cluster replication is designed to minimize delivery lag between the leader and the follower index. Typical delivery times are less than a minute. You can continuously monitor the replication status via APIs. 
Additionally, if you have indices that follow an index pattern, you can create automatic follow rules so that matching indices are replicated automatically.</p><p>In this post, we show you how to use these features to ensure availability of your data using cross-cluster replication with OpenSearch Service.</p><h2>Benefits of cross-cluster replication</h2><p>Cross-cluster replication is helpful for use cases involving data proximity, disaster recovery, and multi-cluster patterns.</p><p>Data proximity helps reduce latency and response time by bringing the data closer to your user or application server. For example, you can replicate data from one Region, <code>us-west-2</code> (leader), to multiple Regions across the globe acting as followers, <code>eu-west-1</code>, <code>ap-south-1</code>, <code>ca-central-1</code>, and so on, where each follower polls the leader to sync new or updated data. In the following diagram, data is replicated from one production cluster in <code>us-west-2</code> to multiple locally available clusters near the user or application.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/21/BDB-2581-image001.jpg"><img class="alignnone size-full wp-image-34726" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/21/BDB-2581-image001.jpg" alt="" width="1632" height="1088" /></a></p><p>In the case of disaster recovery, you can have one or more follower clusters in the same Region or different Regions, and as long as you have one active cluster, you can serve read requests to the users. 
In the following diagram, data is replicated from one production cluster to two different disaster recovery clusters.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/21/BDB-2581-image003.jpg"><img class="alignnone size-full wp-image-34727" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/21/BDB-2581-image003.jpg" alt="" width="1496" height="892" /></a></p><p>As of today, cross-cluster replication supports active/active read and active/passive write, as shown in the following diagram.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/21/BDB-2581-image005.jpg"><img class="alignnone size-full wp-image-34728" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/21/BDB-2581-image005.jpg" alt="" width="1180" height="826" /></a></p><p>With this implementation, you can keep serving reads if your leader goes down, but what about writes? As of this writing, cross-cluster replication doesn’t support any kind of failover mechanism to make your follower the leader. In this scenario, you might need to do some extra housekeeping to make your follower domain become the leader and start accepting write requests. 
This post shows the steps to set up cross-cluster replication and minimize downtime by advancing your follower to be leader.</p><h2>Set up cross-cluster replication</h2><p>To set up cross-cluster replication, complete the following steps:</p><ol><li>Create two clusters across two Regions, for example <code>leader-east</code> (leader) and <code>follower-west</code> (follower). Cross-cluster replication works on a pull model, where the user creates an outbound connection at the follower domain, and the follower keeps polling the leader to sync with new or updated documents for an index.</li><li>Go to the follower domain (<code>follower-west</code>) and create a request for an outbound connection. Specify the alias for this connection as <code>follower-west</code>.</li><li>Go to the leader domain, locate the inbound connection, and approve the incoming connection from <code>follower-west</code>.</li><li>Edit the security configuration and add the following access policy to allow <code>ESCrossClusterGet</code> in the leader domain, which is <code>leader-east</code>:</li><li>Create a leader index (on the leader domain), or ignore this step if you already have an index to replicate:</li><li>Navigate to OpenSearch Dashboards for the <code>follower-west</code> domain.</li><li>On the <strong>Dev Tools</strong> tab, run the following command (or use curl to connect directly):</li><li>Confirm the replication:</li><li>Index some documents in the leader index; the following command indexes documents to the <code>catalog</code> index with <code>id:1</code>:</li><li>Now go to the follower domain and confirm the documents are replicated by running the following search query:</li></ol><h2>Pause and stop the replication</h2><p>When your replication is running, you can use these steps to pause and stop the replication.</p><p>You can use the following API to pause the replication, for example, while you debug an issue or load on the leader. 
Make sure to add an empty body with the request.</p><p>If you pause the replication, you must resume it within 12 hours. If you fail to resume it within 12 hours, you must stop replication, delete the follower index, and restart replication of the leader.</p><p>Stopping the replication makes the follower index unfollow the leader and become a standard index. Use the following code to stop replication:</p><p>Note that you can’t restart replication to this index after you stop it.</p><h2>Auto-follow</h2><p>You can define a set of replication rules against a single leader domain that automatically replicates indexes that match a specified pattern.</p><p>When an index on the leader domain matches one of the patterns (for example, <code>logstash-*</code>), a matching follower index is created on the follower domain. The following code is an example replication rule for <a href="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/replication.html#replication-autofollow" target="_blank" rel="noopener noreferrer">auto-follow</a>:</p><p>Delete the replication rule to stop replicating new indexes that match the pattern:</p><h2>Monitor cross-cluster replication metrics</h2><p>OpenSearch Service provides metrics to monitor cross-cluster replication that can help you know the status of the replication along with its performance. For example, <code>ReplicationRate</code> can help you understand the average rate of replication operations per second, and <code>ReplicationNumSyncingIndices</code> can help you know the number of indexes with the replication status <code>SYNCING</code>. 
For more details about all the metrics provided by OpenSearch Service for cross-cluster replication, refer to <a href="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/managedomains-cloudwatchmetrics.html#managedomains-cloudwatchmetrics-replication" target="_blank" rel="noopener noreferrer">Cross-cluster replication metrics</a>.</p><h2>Recovering from failure</h2><p>At this point, we have two OpenSearch Service domains running in two different Regions. Let’s consider a scenario in which some disastrous event happens in the Region with your leader domain and the leader goes down. At this point, you can still serve read traffic from the follower domain, but no additional updates are applied because the follower can’t read from the leader. In this scenario, you can use the following steps to advance your follower to be leader:</p><ol><li>Go to your follower domain and stop replication:<p>After replication stops on the follower domain, your follower index acts as a normal index.</p></li><li>At this point, you can start sending write traffic to the follower.</li></ol><p>This way, you can advance your follower domain to become leader and route your write traffic to the follower, which helps avoid data loss for new sets of changes and updates.</p><p>Keep in mind that there is a small lag (less than a minute) between the leader-follower sync. Additionally, there could be a small amount of data loss in the follower domain: data that was indexed to the leader but not yet synced to the follower (especially when the leader went down and the follower didn’t have a chance to poll the changes and updates). For this scenario, you should have a mechanism in your ingest pipeline to replay the data to the follower when your leader goes down.</p><p>Now, what if the leader comes back online after a certain period of time? At this time, you can’t start the replication again from your follower to sync the delta to the leader. 
Even if you try to set up the replication from follower to leader, it will fail with an error. After you have used an index for a leader-follower connection, you can’t use the same index again to create a new replication. So, what do you do now?</p><p>In this scenario, you can use the following steps to set up a leader-follower connection in the opposite direction:</p><ol><li>Delete the index from the old leader.</li><li>Set up cross-Region replication in the opposite direction with your new leader (<code>follower-west</code>) and new follower (<code>leader-east</code>).</li><li>Start the replication on the new follower (which was your old leader) and sync the data.</li></ol><p>This runs the sync for all data again for that index, and may take time depending upon the size of the index because it will bootstrap the index and start the replication from scratch. Additionally, you will incur standard <a href="https://aws.amazon.com/ec2/pricing/" target="_blank" rel="noopener noreferrer">AWS data transfer costs</a> for the data transferred with this replication. This way, you can advance your follower (<code>follower-west</code>) to be leader and make your leader (<code>leader-east</code>) the new follower.</p><h2>Conclusion</h2><p>In this post, we showed you how you can use cross-cluster replication to sync data between leader and follower indices. We also demonstrated how you can advance your follower to become leader in case your leader goes down. This can help you serve traffic in the event of any disaster scenarios.</p><p>If you have feedback about this post, submit your comments in the comments section. 
If you have questions about this post, start a new thread on the <a href="https://forums.aws.amazon.com/forum.jspa?forumID=200" target="_blank" rel="noopener noreferrer">Amazon OpenSearch Service forum</a> or <a href="https://console.aws.amazon.com/support/home" target="_blank" rel="noopener noreferrer">contact AWS Support</a>.</p><h3><strong>About the Author</strong></h3><p class="c4"><strong><em><a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/06/27/PRahsnat-Agrawal.jpg"><img class="size-full wp-image-31236 alignleft" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/06/27/PRahsnat-Agrawal.jpg" alt="" width="100" height="159" /></a></em>Prashant Agrawal</strong> is a Search Specialist Solutions Architect with OpenSearch Service. He works closely with customers to help them migrate their workloads to the cloud and helps existing customers fine-tune their clusters to achieve better performance and save on cost. Before joining AWS, he helped various customers use OpenSearch and Elasticsearch for their search and log analytics use cases. When not working, you can find him traveling and exploring new places. In short, he likes doing Eat → Travel → Repeat.</p></section>
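
The replication lifecycle described in this post (start, confirm, pause, resume, stop, auto-follow) can be sketched as a set of request builders. This is a minimal illustration, not the post's own commands: it assumes the OpenSearch cross-cluster replication plugin endpoints under _plugins/_replication, the follower index name catalog-follower and the rule name logstash-rule are hypothetical, and the all_access role is an assumption that should be replaced with roles that exist on your domains. The functions only build the method, path, and JSON body; sending them with a signed HTTP client or pasting them into Dev Tools is left out.

```python
import json

REPLICATION_BASE = "/_plugins/_replication"

# Assumption: both domains have a role named "all_access"; substitute your own roles.
DEFAULT_ROLES = {
    "leader_cluster_role": "all_access",
    "follower_cluster_role": "all_access",
}


def start_replication(follower_index, leader_alias, leader_index, roles=None):
    """Build the request that makes follower_index start following leader_index."""
    body = {
        "leader_alias": leader_alias,  # alias of the outbound connection
        "leader_index": leader_index,
        "use_roles": roles or DEFAULT_ROLES,
    }
    return "PUT", f"{REPLICATION_BASE}/{follower_index}/_start", json.dumps(body)


def replication_status(follower_index):
    """Build the request that reports the replication status of a follower index."""
    return "GET", f"{REPLICATION_BASE}/{follower_index}/_status", None


def change_replication_state(follower_index, action):
    """Build a pause, resume, or stop request; each is sent with an empty JSON body."""
    if action not in ("pause", "resume", "stop"):
        raise ValueError(f"unsupported action: {action}")
    return "POST", f"{REPLICATION_BASE}/{follower_index}/_{action}", "{}"


def create_autofollow_rule(leader_alias, rule_name, pattern, roles=None):
    """Build the request that replicates every leader index matching pattern."""
    body = {
        "leader_alias": leader_alias,
        "name": rule_name,
        "pattern": pattern,
        "use_roles": roles or DEFAULT_ROLES,
    }
    return "POST", f"{REPLICATION_BASE}/_autofollow", json.dumps(body)


def delete_autofollow_rule(leader_alias, rule_name):
    """Build the request that stops replicating new indexes matching the rule."""
    body = {"leader_alias": leader_alias, "name": rule_name}
    return "DELETE", f"{REPLICATION_BASE}/_autofollow", json.dumps(body)


if __name__ == "__main__":
    # All requests are issued against the follower domain (follower-west).
    requests_to_send = [
        start_replication("catalog-follower", "follower-west", "catalog"),
        replication_status("catalog-follower"),
        change_replication_state("catalog-follower", "pause"),
        change_replication_state("catalog-follower", "resume"),
        create_autofollow_rule("follower-west", "logstash-rule", "logstash-*"),
    ]
    for method, path, body in requests_to_send:
        print(method, path, body or "")
```

For the failover case in "Recovering from failure", the only replication call needed on the follower is change_replication_state("catalog-follower", "stop"); after that the follower index behaves as a standard index and can accept writes. Keeping the builders free of network code makes it easy to check the paths and bodies before pointing them at a real domain.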
