未知数据源 2024年10月02日
Enable self-service visual data integration and analysis for fund performance using AWS Glue Studio and Amazon QuickSigh...
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

文章介绍了IMM基金的投资特点及监管要求,传统银行系统的局限性,以及如何利用云技术和相关服务进行数据分析。包括采用AWS服务实现近实时分析、可视化和应对市场波动等内容。

🧊IMM基金是投资于高流动性工具、现金及现金等价物的共同基金,对美国金融稳定至关重要,受严格监管,市场压力下基金经理可采取流动性费用或赎回限制措施。

💻传统银行使用遗留系统,数据和业务逻辑紧密耦合,分析师和基金经理难以进行自助服务和获取实时分析,难以应对市场波动和满足监管要求。

☁️许多银行采用云技术,将遗留工作负载迁移,以降低成本、提高竞争力。AWS提供多种服务,如Glue Studio和QuickSight,可帮助进行近实时分析和可视化。

📊通过AWS CloudFormation模板创建资源,包括部署、创建连接等步骤,实现对IMM基金交易的近实时分析,以应对市场波动并满足监管要求。

<section class="blog-post-content"><p>IMM (Institutional Money Market) is a mutual fund that invests in highly liquid instruments, cash, and cash equivalents. IMM funds are large financial intermediaries that are crucial to financial stability in the US. Due to its criticality, IMM funds are highly regulated under the security laws, notably Rule 2a-7, Which states that during market stress, fund managers can impose a liquidity fee up to 2% or redemption gates (a delay in processing redemption) if the fund’s weekly liquid assets drop below 30% of its total assets. The liquidity fees and gates allow money market funds to stop heavy redemption in times of market volatility.</p><p>Traditional banks use legacy systems and rely on monolithic architectures. Typically, data and business logic is tightly coupled on the same mainframe machines. It’s hard for analysts and fund managers to perform self-service and gather real-time analytics from these legacy systems. They work on the previous nightly report and struggle to keep up with market fluctuations. The slightest modification to the reports on these legacy systems involves vast costs, time, and significant dependency on the software development team. Due to these limitations, analysts and fund managers can’t respond effectively to market trends and face a tremendous challenge in adhering to the regulatory requirements of monitoring the market volatility.</p><p>Over the last few years, many banks have adopted the cloud. Banks have migrated their legacy workloads to reduce cost, improve their competitive advantage, and address competition from FinTech and startups. As part of the cloud strategy, many mainframe applications got re-platformed or re-architected to a more efficient database platform. However, many opportunities exist in modernizing the application. One such option is to enable self-service to run real-time analytics. AWS offers various services that help such use cases. In this post, we demonstrate how to analyze fund performance visually using AWS Glue Studio and QuickSight in a self-service fashion.</p><p>The aim of the post is to assist operations analysts and fund managers to self-service their data analysis needs without previous coding experience. This post demonstrates how <a href="https://aws.amazon.com/glue&quot; target="_blank" rel="noopener noreferrer">AWS Glue Studio</a> reduces the software development team’s dependency and helps analysts and fund managers perform near-real-time analytics. This post also illustrates how to build visualizations and quickly get business insights using <a href="https://aws.amazon.com/quicksight&quot; target="_blank" rel="noopener noreferrer">Amazon QuickSight</a>.</p><h2>Solution overview</h2><p>Most banks record their daily trading transactions activity in relational database systems. A relational database keeps the ledger of daily transactions that involves many buys and sells of IMM funds. We use the mock trades data and a simulated Morningstar data feed to demonstrate our use case.</p><p>The following sample <a href="http://aws.amazon.com/rds&quot; target="_blank" rel="noopener noreferrer">Amazon Relational Database Service</a> (Amazon RDS) instance records daily IMM trades, and Morningstar market data gets stored in <a href="http://aws.amazon.com/s3&quot; target="_blank" rel="noopener noreferrer">Amazon Simple Storage Service</a> (Amazon S3). With AWS Glue Studio, analysts and fund managers can analyze the IMM trades in near-real time and compare them with market observations from Morningstar. They can then review the data in <a href="http://aws.amazon.com/athena&quot; target="_blank" rel="noopener noreferrer">Amazon Athena</a>, and use QuickSight to visualize and further analyze the trade patterns and market trends.</p><p>This near-real time and self-service enables fund managers quickly respond to the market volatility and apply fees or gates on IMM funds to comply with Rule 2a-7 regulatory requirements.</p><p>The following diagram illustrates the solution architecture.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image001.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34348" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image001.png&quot; alt="" width="782" height="429" /></a></p><h2>Provision resources with AWS CloudFormation</h2><p>To create your resources for this use case, we deploy an <a href="http://aws.amazon.com/cloudformation&quot; target="_blank" rel="noopener noreferrer">AWS CloudFormation</a> template. Complete the following steps:</p><ol><li>Choose <strong>Launch Stack</strong> (<code>in us-east-1</code>):<a href="https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/template?stackName=fund-analysis&amp;amp;templateURL=https://aws-bigdata-blog.s3.amazonaws.com/artifacts/analyze_fund_performance_using_glue/fund_blog_.yml&quot; target="_blank" rel="noopener noreferrer"><img class="alignnone wp-image-29651 size-full" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/06/01/launchstack.png&quot; alt="" width="107" height="20" /></a></li><li>Choose <strong>Next</strong> three times to reach the <strong>Review</strong> step.</li><li>Select <strong>I acknowledge that AWS CloudFormation might create IAM resources</strong>.</li><li>Choose <strong>Create stack</strong>.</li></ol><h2>Create an AWS Glue connection</h2><p>You create an AWS Glue connection to access the MySQL database created by the CloudFormation template. An AWS Glue crawler uses the connection in the next step.</p><ol><li>On the AWS Glue console, under <strong>Databases</strong> in the navigation pane, choose <strong>Connections</strong>.</li><li>Choose <strong>Add connection</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image006.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34349 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image006.png&quot; alt="" width="540" height="146" /></a></li><li>For <strong>Connection name</strong>, enter <code>Trade-Analysis</code>.</li><li>For <strong>Connection type</strong>¸ choose <strong>JDBC</strong>.</li><li>Choose <strong>Next</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image009.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34350 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image009.png&quot; alt="" width="799" height="358" /></a></li><li>For <strong>JDBC URL</strong>, enter your URL.To connect to an <a href="https://aws.amazon.com/rds/mysql/&quot; target="_blank" rel="noopener noreferrer">Amazon RDS for MySQL</a> data store with a DBDEV database, use the following code:<p>For more details, see <a href="https://docs.aws.amazon.com/glue/latest/dg/connection-defining.html&quot; target="_blank" rel="noopener noreferrer">AWS Glue connection properties</a>. Refer to the CloudFormation <code>fund-analysis</code> stack <strong>Outputs</strong> tab to get the Amazon RDS ARN.</p><p>The next step requires you to first retrieve your MySQL database user name and password via <a href="https://aws.amazon.com/secrets-manager/&quot; target="_blank" rel="noopener noreferrer">AWS Secrets Manager</a>.</p></li><li>On the Secrets Manager console, choose <strong>Secrets</strong> in the navigation pane.</li><li>Choose the <code>secret rds-secret-fund-analysis</code>.</li><li>Choose <strong>Retrieve secret value</strong> to get the user name and password.</li><li>Return to the connection configuration and enter the user name and password.</li><li>For <strong>VPC</strong>, choose the VPC ending with <code>fund-analysis</code>.</li><li>For <strong>Subnet</strong> and <strong>Security groups</strong>, choose the values ending with <code>fund-analysis</code>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image011.png&quot;&gt;&lt;img class="alignnone wp-image-34351 size-full c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image011.png&quot; alt="" width="558" height="524" /></a></li><li>Choose <strong>Next</strong> and <strong>Finish</strong> to complete the connection setup.</li><li>Select the connection you created and choose <strong>Test Connection</strong>.</li><li>For <strong>IAM role</strong>, choose the role <code>AWSGlueServiceRole-Studio</code>.</li></ol><p>For more details about using <a href="http://aws.amazon.com/iam&quot; target="_blank" rel="noopener noreferrer">AWS Identity and Access Management</a> (IAM), refer to <a href="https://docs.aws.amazon.com/glue/latest/ug/setting-up.html&quot; target="_blank" rel="noopener noreferrer">Setting up for AWS Glue Studio</a>.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image013.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34352 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image013.png&quot; alt="" width="414" height="166" /></a></p><h2>Create and run AWS Glue crawlers</h2><p>In this step, you create two crawlers. The crawlers connect to a data store, determine the schema for your data, and then create metadata tables in your AWS Glue Data Catalog.</p><h3>Crawl MySQL data stores</h3><p>The first crawler creates metadata for the MySQL data stores. Complete the following steps:</p><ol><li>On the AWS Glue console, choose <strong>Crawlers</strong> in the navigation pane.</li><li>Choose <strong>Add crawler</strong>.</li><li>For <strong>Crawler name</strong>, enter <code>Trades Crawlers</code>.</li><li>Choose <strong>Next</strong>.</li><li>For <strong>Crawler source type</strong>, choose <strong>Data stores</strong>.</li><li>For <strong>Repeat crawls of S3 data stores</strong>, choose <strong>Crawl all folders</strong>.</li><li>Choose <strong>Next</strong>.</li><li>For <strong>Choose a data store</strong>, choose <strong>JDBC</strong>.</li><li>For <strong>Connection</strong>, choose <strong>Trade-Analysis</strong>.</li><li>For <strong>Include path</strong>, enter the MySQL database name (<code>DBDEV</code>).</li><li>Choose <strong>Next</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image015.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34353 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image015.png&quot; alt="" width="492" height="381" /></a></li><li>For <strong>Add another data store</strong>, choose <strong>No</strong>.</li><li>Choose <strong>Next</strong>.</li><li>For the IAM role to access the data stores, choose the role <code>AWSGlueServiceRole-Studio</code>.</li><li>For <strong>Frequency</strong>, choose <strong>Run on demand</strong>.</li><li>Choose <strong>Add database</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image017.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34354 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image017.png&quot; alt="" width="452" height="150" /></a></li><li>For <strong>Database name</strong>, enter <code>trade_analysis_db</code>.</li><li>Choose <strong>Create</strong>.</li><li>Choose <strong>Next</strong>.</li><li>Review all the steps and choose <strong>Finish</strong> to create your crawler.</li><li>Select the <code>Trades Crawlers</code> crawler and choose <strong>Run crawler</strong> to get the metadata.</li></ol><h3>Crawl Amazon S3 data stores</h3><p>Now you configure a crawler to create metadata for the Amazon S3 data stores.</p><ol><li>On the AWS Glue console, choose <strong>Crawlers</strong> in the navigation pane.</li><li>Choose <strong>Add crawler</strong>.</li><li>For <strong>Crawler name</strong>, enter <code>Ratings</code>.</li><li>Choose <strong>Next</strong>.</li><li>For <strong>Crawler source type</strong>, choose <strong>Data stores</strong>.</li><li>For <strong>Repeat crawls of S3 data stores</strong>, choose <strong>Crawl all folders</strong>.</li><li>Choose <strong>Next</strong>.</li><li>For <strong>Choose a data store</strong>, choose <strong>S3</strong>.</li><li>For <strong>Connection</strong>, choose <strong>Trade-Analysis</strong>.</li><li>For <strong>Include path</strong>, enter <code>s3://aws-bigdata-blog/artifacts/analyze_fund_performance_using_glue/Morningstar.csv</code>.</li><li>Choose <strong>Next</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image019.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34356 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image019.png&quot; alt="" width="728" height="634" /></a></li><li>For <strong>Add another data store</strong>, choose <strong>No</strong>.</li><li>Choose <strong>Next</strong>.</li><li>For the IAM role to access the data stores, choose the role <code>AWSGlueServiceRole-Studio</code>.</li><li>For <strong>Frequency</strong>, choose <strong>Run on demand</strong>.</li><li>Choose <strong>Add database</strong>.</li><li>For <strong>Database name</strong>, enter <code>trade_analysis_db</code>.</li><li>Review all the steps and choose <strong>Finish</strong> to create your crawler.</li><li>Select the <code>Ratings</code> crawler and choose <strong>Run crawler</strong> to get the metadata.</li></ol><h2>Review crawler output</h2><p>To review the output of your two crawlers, navigate to the <strong>Databases</strong> page on the AWS Glue console.</p><p>You can review the database <code>trade_analysis_db</code> created in previous steps and the contents of the metadata tables.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image021.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34357 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image021.png&quot; alt="" width="589" height="100" /></a></p><h2>Create a job using AWS Glue Studio</h2><p>A job is the AWS Glue component that allows the implementation of business logic to transform data as part of the extract, transform, and load (ETL) process. For more information, see <a href="https://docs.aws.amazon.com/glue/latest/dg/add-job.html&quot; target="_blank" rel="noopener noreferrer">Adding jobs in AWS Glue</a>.</p><p>To create an AWS Glue job using AWS Glue Studio, complete the following steps:</p><ol><li>On the AWS Glue console, in the navigation pane, choose <strong>AWS Glue Studio</strong>.</li><li>Choose <strong>Create and manage jobs</strong>.</li><li>Choose <strong>View jobs</strong>.AWS Glue Studio supports different sources. For this post, you use two AWS Glue tables as data sources and one S3 bucket as the destination.</li><li>In the <strong>Create job</strong> section, select <strong>Visual with a blank canvas</strong>.</li><li>Choose <strong>Create</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image023.jpg&quot;&gt;&lt;img class="alignnone size-full wp-image-34358 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image023.jpg&quot; alt="" width="935" height="153" /></a>This takes you to the visual editor to create an AWS Glue job.</li><li>Change the job name from <code>Untitled Job</code> to <code>Trade-Analysis-Job</code>.</li></ol><p>You now have an AWS Glue job ready to filter, join, and aggregate data from two different sources.</p><h2>Add two data sources</h2><p>For this post, you use two AWS Glue tables as data sources: <code>Trades</code> and <code>Ratings</code>, which you created earlier.</p><ol><li>On the AWS Glue Studio console, on the <strong>Source</strong> menu, choose <strong>MySQL</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image024.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34359 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image024.png&quot; alt="" width="519" height="419" /></a></li><li>On the <strong>Node properties</strong> tab, for <strong>Name</strong>, enter Trades.</li><li>For <strong>Node type</strong>, choose <strong>MySQL</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image026.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34360 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image026.png&quot; alt="" width="1639" height="467" /></a></li><li>On the <strong>Data Source properties – MySQL</strong> tab, for <strong>Database</strong>, choose <code>trade_analysis_db</code>.</li><li>For <strong>Table</strong>, choose <code>dbdev_mft_actvitity</code>.Before adding the second data source to the analysis job, be sure that the node you just created isn’t selected.</li><li>On the <strong>Source</strong> menu, choose <strong>Amazon S3</strong>.</li><li>On the <strong>Node properties</strong> tab, for <strong>Name</strong>, enter <code>Ratings</code>.</li><li>For <strong>Node type</strong>, choose <strong>Amazon S3</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image028.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34361 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image028.png&quot; alt="" width="1622" height="455" /></a></li><li>On the <strong>Data Source properties – S3</strong> tab, for <strong>Database</strong>, choose <code>trade_analysis_db</code>.</li><li>For <strong>Table</strong>, choose <code>morning_star_csv</code>.You now have two AWS Glue tables as the data sources for the AWS Glue job.The <strong>Data preview</strong> tab helps you sample your data without having to save or run the job. The preview runs each transform in your job so you can test and debug your transformations.</li><li>Choose the <code>Ratings</code> node and on the <strong>Data preview</strong> tab, choose <strong>Start data preview session</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image030.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34362 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image030.png&quot; alt="" width="1827" height="341" /></a></li><li>Choose the <code>AWSGlueServiceRole-Studio</code> IAM role and choose <strong>Confirm</strong> to sample the data.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image032.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34363 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image032.png&quot; alt="" width="1844" height="489" /></a></li></ol><p>Data previews are available for each source, target, and transform node in the visual editor, so you can verify the results step by step for other nodes.</p><h2>Join two tables</h2><p>A <em>transform</em> is the AWS Glue Studio component were the data is modified. You have the option of using different transforms that are part of this service or custom code. To add transforms, complete the following steps:</p><ol><li>On the <strong>Transform</strong> menu, choose <strong>Join</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image034.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34364 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image034.png&quot; alt="" width="359" height="403" /></a></li><li>On the <strong>Node properties</strong> tab, for Name, enter <code>trades and ratings join</code>.</li><li>For <strong>Node type</strong>, choose <strong>Join</strong>.</li><li>For <strong>Node parents</strong>, choose the <code>Trades</code> and <code>Ratings</code> data sources.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image036.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34365 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image036.png&quot; alt="" width="1636" height="546" /></a></li><li>On the <strong>Transform</strong> tab, for <strong>Join type</strong>, choose <strong>Outer join</strong>.</li><li>Choose the common column between the tables to establish the connection.</li><li>For <strong>Join conditions</strong>, choose symbol from the <code>Trades</code> table and <code>mor_rating_fund_symbol</code> from the <code>Ratings</code> table.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image038.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34366 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image038.png&quot; alt="" width="1631" height="531" /></a></li></ol><h2>Add a target</h2><p>Before adding the target to store the result, be sure that the node you just created isn’t selected. To add the target, complete the following steps:</p><ol><li>On the <strong>Target</strong> menu, choose <strong>Amazon S3</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image040.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34367 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image040.png&quot; alt="" width="443" height="197" /></a></li><li>On the <strong>Node properties</strong> tab, for <strong>Name</strong>, enter <code>trades ratings merged</code>.</li><li>For <strong>Node type</strong>, choose <strong>Amazon S3</strong> for writing outputs.</li><li>For <strong>Node parents</strong>, choose <code>trades and ratings join</code>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image042.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34368 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image042.png&quot; alt="" width="1588" height="619" /></a></li><li>On the <strong>Data target properties – S3 tab</strong>, for <strong>Format</strong>, choose <strong>Parquet</strong>.</li><li>For <strong>Compression type</strong>, choose <strong>None</strong>.</li><li>For <strong>S3 target location</strong>, enter <code>s3://glue-studio-blog- {Your Account ID as a 12-digit number}/</code>.</li><li>For <strong>Data catalog update options</strong>, select <strong>Create a table in the Data Catalog and on subsequent runs, update the schema and add new partitions</strong>.</li><li>For <strong>Database</strong>, choose <code>trade-analysis-db</code>.</li><li>For <strong>Table name</strong>, enter <code>tradesratingsmerged</code>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image044.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34369 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image044.png&quot; alt="" width="704" height="413" /></a></li></ol><h2>Configure the job</h2><p>When the logic behind the job is complete, you must set the parameters for the job run. In this section, you configure the job by selecting components such as the IAM role and the AWS Glue version you use to run the job.</p><ol><li>Choose the <strong>Job details</strong> tab.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image046.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34370 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image046.png&quot; alt="" width="642" height="636" /></a></li><li>For <strong>Job bookmark</strong>, choose <strong>Disable</strong>.</li><li>For <strong>Number of retries</strong>, optionally enter <code>0</code>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image048.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34371 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image048.png&quot; alt="" width="698" height="235" /></a></li><li>Choose <strong>Save</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image050.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34372 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image050.png&quot; alt="" width="1572" height="85" /></a></li><li>When the job is saved, choose <strong>Run</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image052.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34373 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image052.png&quot; alt="" width="1633" height="164" /></a></li></ol><h2>Monitor the job</h2><p>AWS Glue Studio offers a job monitoring dashboard that provides comprehensive information about your jobs. You can get job statistics and see detailed information about the job and the job status when running.</p><ol><li>In the AWS Glue Studio navigation pane, choose <strong>Monitoring</strong>.</li><li>Change the date range to 1 hour using the <strong>Date range</strong> selector to get the recently submitted job.The <strong>Job runs summary</strong> section displays the current state of the job run. The status of the job could be Running, Canceled, Success, or Failed.The <strong>Job run success rate</strong> section provides the estimated DPU usage for jobs, and gives you a summary of the performance of the job. <strong>Job type breakdown</strong> and <strong>Worker type breakdown</strong> contain additional information about the job.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image054.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34374 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image054.png&quot; alt="" width="1840" height="783" /></a></li><li>For get more details about the job run, choose <strong>View run details</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image056.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34375 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image056.png&quot; alt="" width="1591" height="149" /></a></li></ol><h2>Review the results using Athena</h2><p>To view the data in Athena, complete the following steps:</p><ol><li>Navigate to the Athena console, where you can see the database and tables created by your crawlers.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image058.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34376 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image058.png&quot; alt="" width="1048" height="501" /></a>If you haven’t used Athena in this account before, a message appears instructing you to set a query result location.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image060.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34377 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image060.png&quot; alt="" width="629" height="134" /></a></li><li>Choose <strong>Settings,</strong> <strong>Manage</strong>, <strong>Browse S3</strong>, and select any bucket that you created.</li><li>Choose <strong>Save</strong> and return to the editor to continue.</li><li>In the <strong>Data</strong> section, expand <strong>Tables</strong> to see the tables you created with the AWS Glue crawlers.</li><li>Choose the options menu (three dots) next to one of the tables and choose <strong>Preview Table</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image062.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34378 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image062.png&quot; alt="" width="825" height="645" /></a></li></ol><p>The following screenshot shows an example of the data.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image064.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34379 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image064.png&quot; alt="" width="1488" height="470" /></a></p><h2>Create a QuickSight dashboard and visualizations</h2><p>To set up QuickSight for the first time, <a href="https://docs.aws.amazon.com/quicksight/latest/user/signing-up.html&quot; target="_blank" rel="noopener noreferrer">sign up for a QuickSight subscription</a> and <a href="https://docs.aws.amazon.com/quicksight/latest/user/athena.html&quot; target="_blank" rel="noopener noreferrer">allow connections to Athena</a>.</p><p>To create a dashboard in QuickSight based on the AWS Glue Data Catalog tables you created, complete the following steps:</p><ol><li>On the QuickSight console, choose <strong>Datasets</strong> in the navigation pane.</li><li>Choose <strong>New dataset</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image066.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34380 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image066.png&quot; alt="" width="936" height="156" /></a></li><li>Create a new QuickSight dataset called Fund-Analysis with Athena as the data source.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image068.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34382 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image068.png&quot; alt="" width="600" height="305" /></a></li><li>In the <strong>Choose your table</strong> section, choose <code>AwsDataCatlog</code> for <strong>Catalog</strong> and choose <code>trade_analysis_db</code> for <strong>Database</strong>.</li><li>For <strong>Tables</strong>, select the <code>tradesratingmerged</code> table to visualize.</li><li>Choose <strong>Select</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image070.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34383 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image070.png&quot; alt="" width="576" height="472" /></a></li><li>Import the data into SPICE.SPICE is an in-memory engine that QuickSight uses to perform advanced calculations and improve performance. Importing the data into SPICE can save time and money. When using SPICE, you can refresh your datasets both fully or incrementally. As of this writing, you can schedule incremental refreshes up to every 15 minutes. For more information, refer to <a href="https://docs.aws.amazon.com/quicksight/latest/user/refreshing-imported-data.html&quot; target="_blank" rel="noopener noreferrer">Refreshing SPICE data</a>. For near-real-time analysis, select <strong>Directly query your data</strong> instead.</li><li>Choose <strong>Visualize</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image072.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34384 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image072.png&quot; alt="" width="597" height="337" /></a>After you create the dataset, you can view it and edit its properties. For this post, leave the properties unchanged.</li><li>To analyze the market performance from the Morningstar file, choose the clustered bar combo chart under <strong>Visual types</strong>.</li><li>Drag <code>Fund_Symbol</code> from <strong>Fields list</strong> to <strong>X-axis</strong>.</li><li>Drag <code>Ratings</code> to <strong>Y-axis</strong> and <strong>Lines</strong>.</li><li>Choose the default title choose <strong>Edit title</strong> to change the title to “Market Analysis.”The following QuickSight dashboard was created using a custom theme, which is why the colors may appear different than yours.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image074.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34385 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image074.png&quot; alt="" width="594" height="442" /></a></li><li>To display the Morningstar details in tabular form, add a visual to create additional graphs.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image076.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34386 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image076.png&quot; alt="" width="181" height="254" /></a></li><li>Choose the table visual under <strong>Visual types</strong>.</li><li>Drag <code>Fund Symbol</code> and <code>Fund Names</code> to <strong>Group by</strong>.</li><li>Drag <code>Ratings</code>, <code>Historical Earnings</code>, and <code>LT Earnings</code> to <strong>Value</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image078.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34387 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image078.png&quot; alt="" width="642" height="334" /></a>In QuickSight, up until this point, you analyzed the market performance reported by Morningstar. Let’s analyze the near-real-time daily trade activities.</li><li>Add a visual to create additional graphs.</li><li>Choose the clustered bar combo chart under <strong>Visual types</strong>.</li><li>Drag <code>Fund_Symbol</code> from <strong>Fields list</strong> to <strong>X-axis</strong> and <code>Trade Amount</code> to <strong>Y-axis</strong>.</li><li>Choose the default title choose <strong>Edit title</strong> to change the title to “Daily Transactions.”<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image080.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34388 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image080.png&quot; alt="" width="776" height="639" /></a></li><li>To display the daily trades in tabular form, add a visual to create additional graphs.</li><li>Drag <code>Trade Date</code>, <code>Customer Name</code>, <code>Fund Name</code>, <code>Fund Symbol</code>, and <code>Buy/Sell</code> to <strong>Group by</strong>.</li><li>Drag <code>Trade Amount</code> to <strong>Value</strong>.<a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image082.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34389 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image082.png&quot; alt="" width="1732" height="906" /></a></li></ol><p>The following screenshot shows a complete dashboard. This compares the market observation reported in the street against the daily trades happening in the bank.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image084.png&quot;&gt;&lt;img class="alignnone size-full wp-image-34391 c4" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/BDB-2204-image084.png&quot; alt="" width="1323" height="694" /></a></p><p>In the <strong>Market Analysis</strong> section of the dashboard, GMFXXD funds were performing well based on the previous night’s feed from Morningstar. However, the <strong>Daily Transactions</strong> section of the dashboard shows that customers were selling their positions from the funds. Relying only on the previous nightly batch report will mislead the fund managers or operation analyst to act.</p><p>Near-real-time analytics using AWS Glue Studio and QuickSight can enable fund managers and analysts to self-serve and impose fees or gates on those IMM funds.</p><h2>Clean up</h2><p>To avoid incurring future charges and to clean up unused roles and policies, delete the resources you created: the CloudFormation stack, S3 bucket, and AWS Glue job.</p><h2>Conclusion</h2><p>In this post, you learned how to use AWS Glue Studio to analyze data from different sources with no previous coding experience and how to build visualizations and get business insights using QuickSight. You can use AWS Glue Studio and QuickSight to speed up the analytics process and allow different personas to transform data with no development experience.</p><p>For more information about AWS Glue Studio, see the <a href="https://docs.aws.amazon.com/glue/latest/ug/what-is-glue-studio.html&quot; target="_blank" rel="noopener noreferrer">AWS Glue Studio User Guide</a>. For information about QuickSight, refer to the <a href="https://docs.aws.amazon.com/quicksight/latest/user/welcome.html&quot; target="_blank" rel="noopener noreferrer">Amazon QuickSight User Guide</a>.</p><h3>About the authors</h3><p class="c5"><strong><a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/rajkarup.png&quot;&gt;&lt;img class="size-full wp-image-34414 alignleft" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/rajkarup.png&quot; alt="" width="100" height="133" /></a>Rajeshkumar Karuppaswamy</strong> is a Customer Solutions Manager at AWS. In this role, Rajeshkumar works with AWS Customers to drive Cloud strategy, provides thought leadership to accelerate businesses achieve speed, agility, and drive innovation. His areas of interests are AI &amp; ML, analytics, and data engineering.</p><p class="c5"><strong><a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/richa-kaul.png&quot;&gt;&lt;img class="size-full wp-image-34413 alignleft" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2022/09/15/richa-kaul.png&quot; alt="" width="100" height="94" /></a>Richa Kaul</strong> is a Senior Leader in Customer Solutions serving Financial Services customers. She is based out of New York. She has extensive experience in large scale cloud transformation, employee excellence, and next generation digital solutions. She and her team focus on optimizing value of cloud by building performant, resilient and agile solutions. Richa enjoys multi sports like triathlons, music, and learning about new technologies.</p><p class="c5"><strong><a href="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2021/02/10/Noritaka-Sekiyama-p.png&quot;&gt;&lt;img class="size-full wp-image-16628 alignleft" src="https://d2908q01vomqb2.cloudfront.net/b6692ea5df920cad691c20319a6fffd7a4a766b8/2021/02/10/Noritaka-Sekiyama-p.png&quot; alt="" width="100" height="133" /></a>Noritaka Sekiyama</strong> is a Principal Big Data Architect on the AWS Glue team. He is responsible for building software artifacts to help customers. This summer, he enjoyed goldfish scooping with his children.</p></section>

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

IMM基金 传统银行 云技术 数据分析
相关文章