Unknown data source · October 2, 2024
How Facteus improved Quantamatics performance by adopting Amazon Aurora Serverless and Amazon EKS

 

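Facteus is a leading provider of actionable insights from sensitive transaction data. Its core product, Quantamatics, turns financial transaction data into actionable information through an innovative synthetic data process. The article covers what Quantamatics does and how Facteus changed and optimized its architecture, including the migration from Snowflake to Aurora Serverless v2.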

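📊 Facteus is a leading provider of actionable insights from sensitive transaction data. Its product, Quantamatics, is a cloud-based, turnkey research platform that shortens the time from raw alternative data to insights and is sold on a subscription model.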

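🚀 In June 2021, Facteus re-architected its monolithic Quantamatics application to use microservices, migrating from Snowflake to Amazon Aurora Serverless v2 and from Amazon EC2 to Amazon EKS. This resolved the maintenance, scalability, and cost problems of the original architecture.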

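🌟 By using Aurora Serverless v2 and Amazon EKS, the new architecture cut data transfer time and cost, improved resource utilization, reduced downtime and cost, and eliminated the node patching overhead.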

📈 Amazon EKS and Aurora Serverless v2 both scale automatically: EKS starts and stops nodes and API instances based on demand, and Aurora Serverless v2 adjusts the database's compute capacity based on load.

<section class="blog-post-content"><p><a href="https://www.facteus.com/">Facteus Inc.</a> is a leading provider of actionable insights from sensitive transaction data. Facteus safely transforms raw financial transaction data from legacy technologies into actionable information, without compromising data privacy, through its innovative synthetic data process. Quantamatics is one of Facteus’ core product offerings.</p><p>Quantamatics shortens the time it takes a user to go from raw alternative data to insights by providing a cloud-based, turnkey research platform that handles data from ingestion to analysis. The platform saves analysts, data researchers, and data scientists time by handling all preparation and normalization before they work with the data for insight discovery. The provided cloud environment also allows for easy and flexible analysis of both provided and external data sources. Quantamatics is a SaaS offering with a subscription model that provides access to both the research platform and the associated Facteus datasets.</p><p>In June 2021, Facteus re-architected their monolithic Quantamatics application to use microservices. This post contrasts the before and after states from a performance and management perspective as Facteus migrated from Snowflake to <a href="https://aws.amazon.com/rds/aurora/serverless/">Amazon Aurora Serverless v2</a> (Postgres) and from <a href="https://aws.amazon.com/ec2/">Amazon Elastic Compute Cloud (Amazon EC2)</a> to <a href="https://aws.amazon.com/eks/">Amazon Elastic Kubernetes Service (Amazon EKS)</a>.</p><p>A great place to start when evaluating existing workloads for fault tolerance and reliability is the <a href="https://aws.amazon.com/architecture/well-architected/">AWS Well-Architected Framework</a>. 
The Well-Architected Framework is designed to help cloud architects build secure, high-performing, resilient, and efficient infrastructure for their applications. Based on six pillars (operational excellence, security, reliability, performance efficiency, cost optimization, and sustainability), the Framework provides a consistent approach for customers to evaluate architectures and implement designs that will scale over time.</p><p>The <a href="https://aws.amazon.com/well-architected-tool/">AWS Well-Architected Tool</a>, available at no charge in the AWS Management Console, lets you create self-assessments to identify and correct gaps in your current architecture. Adhering to Well-Architected principles, Facteus adopted managed services, such as Amazon EKS and <a href="https://aws.amazon.com/rds/aurora/serverless/">Amazon Aurora Serverless</a>, because they reduce the effort spent on provisioning, configuring, scaling, backing up, and so on. Using managed services also helps to save on the overall cost of maintaining those services.</p><h2>Facteus’ architecture overview</h2><h4>Before</h4><p>Users can access Quantamatics for their research either through a Jupyter notebook or a Microsoft Excel plugin. Facteus used EC2 instances to directly host the underlying JupyterHub deployments and <a href="https://aws.amazon.com/elasticbeanstalk/">AWS Elastic Beanstalk</a> to deploy APIs.</p><p>The legacy architecture, while cloud-based, had multiple issues that made it ineffective from a maintenance, scalability, and cost perspective (as shown in Figure 1):</p><ul><li>JupyterHub does not currently support high availability (HA) natively. 
This meant an EC2 failover would either cause a relatively long period of unavailability while a replacement EC2 node spun up, or potentially double the cost by keeping an idle node on standby.<ul><li>Also, because the EC2 instances were specialized, portions of each instance remained unused, resulting in unnecessary costs compared to more modern solutions such as Amazon EKS, which can pool and divide up instances in a more granular fashion.</li><li>Finally, because the EC2 instances were standalone, separate solutions would have been needed to both monitor application health and take the appropriate actions in case of an outage.</li></ul></li><li>Although Elastic Beanstalk was a great way to deploy API instances in an HA and scalable way, Facteus migrated their Elastic Beanstalk instances as well, both to keep the whole application consistent with a microservice-based architecture and to better utilize the pooled resources.</li></ul><div id="attachment_12132" class="wp-caption alignnone c4"><a href="https://d2908q01vomqb2.cloudfront.net/fc074d501302eb2b93e2554793fcaf50b3bf7291/2022/09/26/Figure-1.-1.png"><img aria-describedby="caption-attachment-12132" class="wp-image-12132 size-full" src="https://d2908q01vomqb2.cloudfront.net/fc074d501302eb2b93e2554793fcaf50b3bf7291/2022/09/26/Figure-1.-1.png" alt="Cloud-based legacy architecture" width="977" height="555" /></a><p id="caption-attachment-12132" class="wp-caption-text">Figure 1. Cloud-based legacy architecture</p></div><p>Quantamatics requires a data warehouse solution that runs constantly behind an API to allow for acceptable request and response times. While Snowflake is a great data warehousing and big data querying solution, Facteus found it expensive for their deployment. The queries that the Quantamatics APIs run are typically not computationally expensive but do end up returning relatively large amounts of data. 
This makes transferring the results back to the API over the internet a potential bottleneck.</p><p>To address these bottlenecks, Facteus re-architected their application to run on Amazon EKS, backed by Aurora Serverless v2 (Postgres).</p><p>The new architecture resolves the previous problems in two ways (Figure 2):</p><ul><li>Using Aurora Serverless v2 (Postgres) instead of Snowflake to store and query the datasets used by the API, within the same VPC, kept the query run time roughly the same but drastically decreased both the transfer time and the associated costs, due to the locality of the database as well as the cost and scalability of Aurora Serverless v2.</li><li>By switching to Amazon EKS, the underlying EC2 nodes could easily be pooled and more thoroughly utilized across the various deployments, thus reducing costs. Additionally, because the deployments were now containerized, an outage would result in the quick relocation of those containerized apps (pods) to nodes with capacity, thus reducing downtime and cost.<ul><li>As a side benefit, the move to managed nodes on Amazon EKS removed the node patching overhead, because Amazon EKS safely handles the patching of the underlying nodes with a single command.</li><li>Amazon EKS monitors and restarts pods automatically, which eliminated the need to set up and manage a solution that monitors pod health and takes the appropriate actions upon failures.</li></ul></li></ul><div id="attachment_12133" class="wp-caption alignnone c5"><a href="https://d2908q01vomqb2.cloudfront.net/fc074d501302eb2b93e2554793fcaf50b3bf7291/2022/09/26/Figure-2..png"><img aria-describedby="caption-attachment-12133" class="wp-image-12133 size-full" src="https://d2908q01vomqb2.cloudfront.net/fc074d501302eb2b93e2554793fcaf50b3bf7291/2022/09/26/Figure-2..png" alt="Contemporary architecture with Amazon EKS and Aurora Serverless v2 (Postgres)" width="1135" height="542" /></a><p 
id="caption-attachment-12133" class="wp-caption-text">Figure 2. Contemporary architecture with Amazon EKS and Aurora Serverless v2 (Postgres)</p></div><h2>Auto scaling with Amazon EKS and Aurora Serverless</h2><ul><li>Amazon EKS helped to greatly reduce the overhead of setting up and managing the auto scaling of Quantamatics in two ways:<ul><li>User compute environments could be spun up as isolated pods, with Amazon EKS spinning nodes up and down automatically based on demand.</li><li>API instances could also be automatically spun up and down based on network throughput metrics queried by Amazon EKS to handle the requests made by users in a timely fashion.</li></ul></li><li>Aurora Serverless v2<ul><li>With Aurora Serverless v2, the compute capacity of the database scales automatically based on the load generated by the corresponding API requests. This reduced cost, because the load varies heavily throughout the day, and it removed the management overhead of spinning read replicas up and down that other solutions would have required.</li></ul></li></ul><h2>Snowflake vs. Aurora Serverless v2 (Postgres) – Quantamatics query performance and cost comparison</h2><p>The following steps were performed to migrate data from Snowflake to Aurora Serverless v2:</p><ul><li>Use the Snowflake <code>COPY INTO &lt;location&gt;</code> command to copy the data from the Snowflake database table into one or more files in an S3 bucket.</li><li>Create tables in Aurora Serverless. 
Use the <a href="https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/postgresql-s3-export.html#aws_commons.create_s3_uri"><code>create_s3_uri</code></a> function to define the Amazon S3 location (bucket, file path, and Region) of the files to import.</li><li>Use the <a href="https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL.Procedural.Importing.html#USER_PostgreSQL.S3Import.FileFormats"><code>aws_s3.table_import_from_s3</code></a> function to import the data file from an Amazon S3 file name prefix.</li><li>Verify that the data was loaded.</li></ul><p><a href="https://aws.amazon.com/blogs/database/export-and-import-data-from-amazon-s3-to-amazon-aurora-postgresql/">This blog post</a> describes importing data from Amazon S3 to Amazon Aurora PostgreSQL.</p><p><strong>Testing strategy:</strong> Run the corresponding CLI database utility for each database (<code>snowsql</code> vs <code>psql</code>) from within the VPC. Run the same query on each dataset. Return and write the results as CSV to a local file.<br /><strong>Data set size:</strong> ~178,000,000 rows<br /><strong>Result set size:</strong> ~418,000 rows</p><table class="c7" border="1" style="width: 1240px;"><tbody><tr><td class="c6"><strong>Data source</strong></td><td class="c6"><strong>Configuration</strong></td><td class="c6"><strong>Results</strong></td></tr><tr><td><strong>Snowflake</strong></td><td>Medium warehouse (running), hosted on AWS in the same Region as the APIs<ul><li>Cost: ~$0.01 per query based on credit usage</li></ul></td><td><ul><li>21.99 seconds total time</li><li>3.36 seconds run time, 18.63 seconds transfer time</li></ul></td></tr><tr><td><strong>Aurora Serverless v2 (Postgres)</strong></td><td>Idling on four Aurora capacity units (ACUs)<ul><li>Cost: ~$0.24 an hour</li><li>Tables and indexes tuned for Quantamatics use cases</li></ul></td><td><ul><li>7.00 seconds total time</li><li>3.58 seconds run time, 3.42 seconds transfer 
time</li></ul></td></tr></tbody></table><h2>Conclusion</h2><p>Facteus achieved similar run times for the given dataset and query, but faster transfer speeds from Aurora Serverless due to the locality of the database. They also realized up to ~40x runtime cost savings by using Aurora Serverless: roughly 1,000 queries in Aurora Serverless vs. ~24 queries in Snowflake for the same cost.</p><p><strong>Note:</strong> These results are specific to Quantamatics use cases where queries are fixed and well-known, and relatively limited in terms of complexity. This allowed the tables and database in Aurora Serverless v2 to be tuned for those specific purposes.</p><p>AWS recommends customers review their workloads using the <a href="https://console.aws.amazon.com/wellarchitected/">AWS Well-Architected Tool</a> to help ensure that their workloads are performant, secure, and cost-optimized. Well-Architected Framework Reviews are excellent opportunities to work together with your AWS account team and key stakeholders to discuss how modern infrastructure can help you win in the market.</p></section>
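
The post does not spell out how the "~40x" figure is derived from the table. One plausible reading, sketched below in Python, is to compare how many queries each service can serve for the price of one Aurora Serverless hour. The constants are the figures quoted in the post; the assumptions that queries run back to back and that only query run time (not transfer time) occupies the database are ours, and the helper names are hypothetical.

```python
# Figures quoted in the comparison table of the post.
SNOWFLAKE_COST_PER_QUERY_USD = 0.01  # medium warehouse, based on credit usage
AURORA_COST_PER_HOUR_USD = 0.24      # idling on four ACUs
SNOWFLAKE_RUN_S, SNOWFLAKE_TRANSFER_S = 3.36, 18.63
AURORA_RUN_S, AURORA_TRANSFER_S = 3.58, 3.42

SECONDS_PER_HOUR = 3600


def total_seconds(run_s: float, transfer_s: float) -> float:
    """End-to-end time for one query: run time plus transfer time."""
    return round(run_s + transfer_s, 2)


def aurora_queries_per_hour(run_s: float = AURORA_RUN_S) -> int:
    """Whole queries that fit in one hour.

    Assumption (ours, not the post's): queries run back to back and only
    the query run time occupies the database.
    """
    return int(SECONDS_PER_HOUR // run_s)


def snowflake_queries_for(budget_usd: float) -> int:
    """Number of Snowflake queries the same budget buys at the quoted price."""
    return round(budget_usd / SNOWFLAKE_COST_PER_QUERY_USD)


aurora_queries = aurora_queries_per_hour()
snowflake_queries = snowflake_queries_for(AURORA_COST_PER_HOUR_USD)
cost_ratio = aurora_queries / snowflake_queries
transfer_speedup = SNOWFLAKE_TRANSFER_S / AURORA_TRANSFER_S

print(f"Snowflake total: {total_seconds(SNOWFLAKE_RUN_S, SNOWFLAKE_TRANSFER_S)} s")
print(f"Aurora total:    {total_seconds(AURORA_RUN_S, AURORA_TRANSFER_S)} s")
print(f"Queries for ${AURORA_COST_PER_HOUR_USD:.2f}: "
      f"Aurora {aurora_queries}, Snowflake {snowflake_queries}")
print(f"Cost ratio: ~{cost_ratio:.0f}x, transfer speedup: ~{transfer_speedup:.1f}x")
```

Under these assumptions the sketch lands on about 1,005 Aurora queries against 24 Snowflake queries for the same $0.24, which is in line with the post's "1,000 vs. ~24" and "~40x" figures. It also shows that run plus transfer time adds up to the 21.99 and 7.00 second totals in the table. A different usage pattern (for example, a database that is mostly idle) would change the ratio.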
