AWS Machine Learning Blog, July 18, 2024
How Deloitte Italy built a digital payments fraud detection solution using quantum machine learning and Amazon Braket

 


As digital commerce expands, fraud detection has become critical in protecting businesses and consumers engaging in online transactions. Implementing machine learning (ML) algorithms enables real-time analysis of high-volume transactional data to rapidly identify fraudulent activity. This advanced capability helps mitigate financial risks and safeguard customer privacy within expanding digital markets.

Deloitte is a strategic global systems integrator with over 19,000 certified AWS practitioners across the globe. It continues to raise the bar through participation in the AWS Competency Program with 29 competencies, including Machine Learning.

This post demonstrates the potential for quantum computing algorithms paired with ML models to revolutionize fraud detection within digital payment platforms. We share how Deloitte built a hybrid quantum neural network solution with Amazon Braket to demonstrate the possible gains coming from this emerging technology.

The promise of quantum computing

Quantum computers have the potential to radically overhaul financial systems, enabling much faster and more precise solutions. Compared to classical computers, quantum computers are expected in the long run to have advantages in the areas of simulation, optimization, and ML. Whether quantum computers can provide a meaningful speedup to ML is an active topic of research.

Quantum computing can perform efficient near real-time simulations in critical areas such as pricing and risk management. Optimization models are key activities in financial institutions, aimed at determining the best investment strategy for a portfolio of assets, allocating capital, or achieving productivity improvements. Some of these optimization problems are nearly impossible for traditional computers to tackle, so approximations are used to solve the problems in a reasonable amount of time. Quantum computers could perform faster and more accurate optimizations without using any approximations.

Despite the long-term horizon, the potentially disruptive nature of this technology means that financial institutions are looking to get an early foothold in this technology by building in-house quantum research teams, expanding their existing ML COEs to include quantum computing, or engaging with partners such as Deloitte.

At this early stage, customers seek access to a choice of different quantum hardware and simulation capabilities in order to run experiments and build expertise. Braket is a fully managed quantum computing service that lets you explore quantum computing. It provides access to quantum hardware from IonQ, OQC, QuEra, Rigetti, and IQM; a variety of local and on-demand simulators, including GPU-enabled simulations; and infrastructure for running hybrid quantum-classical algorithms such as quantum ML. Braket is fully integrated with AWS services such as Amazon Simple Storage Service (Amazon S3) for data storage and AWS Identity and Access Management (IAM) for identity management, and you pay only for what you use.

In this post, we demonstrate how to implement a quantum neural network-based fraud detection solution using Braket and AWS native services. Although quantum computers can’t be used in production today, our solution provides a workflow that will seamlessly adapt and function as a plug-and-play system in the future, when commercially viable quantum devices become available.

Solution overview

The goal of this post is to explore the potential of quantum ML and present a conceptual workflow that could serve as a plug-and-play system when the technology matures. Quantum ML is still in its early stages, and this post aims to showcase the art of the possible without delving into specific security considerations. As quantum ML technology advances and becomes ready for production deployments, robust security measures will be essential. However, for now, the focus is on outlining a high-level conceptual architecture that can seamlessly adapt and function in the future when the technology is ready.

The following diagram shows the solution architecture for the implementation of a neural network-based fraud detection solution using AWS services. The solution is implemented using a hybrid quantum neural network. The neural network is built using the Keras library; the quantum component is implemented using PennyLane.

The workflow includes the following key components for inference (A–F) and training (G–I):

    Ingestion – Real-time financial transactions are ingested through Amazon Kinesis Data Streams
    Preprocessing – AWS Glue streaming extract, transform, and load (ETL) jobs consume the stream to do preprocessing and light transforms
    Storage – Amazon S3 is used to store output artifacts
    Endpoint deployment – We use an Amazon SageMaker endpoint to deploy the models
    Analysis – Transactions along with the model inferences are stored in Amazon Redshift
    Data visualization – Amazon QuickSight is used to visualize the results of fraud detection
    Training data – Amazon S3 is used to store the training data
    Modeling – A Braket environment produces a model for inference
    Governance – Amazon CloudWatch, IAM, and AWS CloudTrail are used for observability, governance, and auditability, respectively

Dataset

For training the model, we used open source data available on Kaggle. The dataset contains transactions made by credit cards in September 2013 by European cardholders. This dataset records transactions that occurred over a span of 2 days, during which there were 492 instances of fraud detected out of a total of 284,807 transactions. The dataset exhibits a significant class imbalance, with fraudulent transactions accounting for just 0.172% of the entire dataset. Because the data is highly imbalanced, various measures have been taken during data preparation and model development.

The dataset exclusively comprises numerical input variables, which have undergone a Principal Component Analysis (PCA) transformation because of confidentiality reasons.

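Besides the PCA-transformed numerical input features, the data includes three key fields:

    Time – Time elapsed between each transaction and the first transaction in the dataset
    Amount – Transaction amount
    Class – Target variable: 1 for fraud, 0 for non-fraud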

Data preparation

We split the data into training, validation, and test sets, and we define the target and feature sets, where Class is the target variable:

    y_train = df_train['Class']
    x_train = df_train.drop(['Class'], axis=1)
    y_validation = df_validation['Class']
    x_validation = df_validation.drop(['Class'], axis=1)
    y_test = df_test['Class']
    x_test = df_test.drop(['Class'], axis=1)
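
The post does not show how df_train, df_validation, and df_test were produced. As one possible approach, the following sketch builds a stratified split with NumPy so that each subset keeps some of the rare fraud cases. The 60/20/20 fractions, the seed, the synthetic labels, and the helper name stratified_split are all assumptions made for illustration.

```python
import numpy as np


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return (train, validation, test) index arrays, stratified by label.

    The fractions and the seed are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    parts = ([], [], [])
    for cls in np.unique(labels):
        # Shuffle the row indices of this class, then cut them into three pieces
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_train = int(round(fractions[0] * len(idx)))
        n_val = int(round(fractions[1] * len(idx)))
        parts[0].append(idx[:n_train])
        parts[1].append(idx[n_train:n_train + n_val])
        parts[2].append(idx[n_train + n_val:])
    return tuple(np.sort(np.concatenate(p)) for p in parts)


# Synthetic labels with an imbalance similar to the dataset (hypothetical data)
labels = np.zeros(10_000, dtype=int)
labels[:17] = 1

train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
print(labels[train_idx].sum(), labels[val_idx].sum(), labels[test_idx].sum())
```

With a pandas DataFrame df, the resulting index arrays could be applied as df.iloc[train_idx], df.iloc[val_idx], and df.iloc[test_idx].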

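The Class field takes the values 0 and 1. Because the network produces two outputs, we label-encode the y sets and then one-hot encode them:

    import tensorflow as tf
    from sklearn.preprocessing import LabelEncoder

    lbl_clf = LabelEncoder()
    y_train = lbl_clf.fit_transform(y_train)
    y_train = tf.keras.utils.to_categorical(y_train)
    # y_validation and y_test need the same encoding (lbl_clf.transform, then to_categorical)

The encoding maps every value as follows: 0 becomes [1, 0] and 1 becomes [0, 1].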
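
To make the mapping concrete, here is a small NumPy-only sketch of the same one-hot encoding. It does not use scikit-learn or Keras; the helper name one_hot and the sample labels are hypothetical.

```python
import numpy as np


def one_hot(labels, num_classes=2):
    """Map integer class labels to one-hot rows: 0 -> [1, 0], 1 -> [0, 1]."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(num_classes)[labels]


encoded = one_hot([0, 1, 1, 0])   # hypothetical labels
decoded = encoded.argmax(axis=1)  # recover the integer labels
print(encoded)
print(decoded)
```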

Finally, we apply scaling that standardizes the features by removing the mean and scaling to unit variance:

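    from sklearn.preprocessing import StandardScaler

    std_clf = StandardScaler()
    x_train = std_clf.fit_transform(x_train)
    # Reuse the statistics fitted on the training set instead of refitting on validation data
    x_validation = std_clf.transform(x_validation)
    x_test = std_clf.transform(x_test)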

The LabelEncoder and StandardScaler classes are available in the scikit-learn Python library.
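
The following sketch illustrates what standardization does and why the scaling statistics are usually fitted on the training set only and then reused for the other sets. SimpleStandardScaler is a hypothetical, minimal stand-in written with NumPy, not the scikit-learn implementation, and the data is randomly generated.

```python
import numpy as np


class SimpleStandardScaler:
    """Minimal stand-in for a standard scaler (illustration only)."""

    def fit(self, x):
        x = np.asarray(x, dtype=float)
        self.mean_ = x.mean(axis=0)
        scale = x.std(axis=0)        # population standard deviation per column
        scale[scale == 0.0] = 1.0    # avoid dividing by zero for constant columns
        self.scale_ = scale
        return self

    def transform(self, x):
        return (np.asarray(x, dtype=float) - self.mean_) / self.scale_


rng = np.random.default_rng(0)
x_train_demo = rng.normal(loc=50.0, scale=10.0, size=(1000, 3))  # hypothetical data
x_val_demo = rng.normal(loc=50.0, scale=10.0, size=(200, 3))

scaler = SimpleStandardScaler().fit(x_train_demo)  # statistics from training data only
x_train_scaled = scaler.transform(x_train_demo)
x_val_scaled = scaler.transform(x_val_demo)        # same mean and scale reused

print(x_train_scaled.mean(axis=0).round(6), x_train_scaled.std(axis=0).round(6))
```

The scaled training data has zero mean and unit variance by construction, while the validation data is only approximately standardized because it is scaled with the training statistics.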

After all the transformations are applied, the dataset is ready to be the input of the neural network.

Neural network architecture

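Based on several empirical tests, we composed the neural network architecture from the following layers, in this order: a dense layer with 32 units, a dropout layer, a dense layer with 9 units, a second dropout layer, and the quantum layer as the output layer.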

We apply an L2 regularization on the first layer and both L1 and L2 regularization on the second one, to avoid overfitting. We initialize all the kernels using the he_normal function. The dropout layers are meant to reduce overfitting as well.

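    import tensorflow as tf
    from tensorflow.keras.layers import Dense, Dropout

    # First dense layer with L2 regularization
    hidden = Dense(32, activation="relu", kernel_initializer='he_normal',
                   kernel_regularizer=tf.keras.regularizers.l2(0.01))
    # Second dense layer with combined L1 and L2 regularization
    out_2 = Dense(9, activation="relu", kernel_initializer='he_normal',
                  kernel_regularizer=tf.keras.regularizers.l1_l2(l1=0.001, l2=0.001))
    do = Dropout(0.3)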

Quantum circuit

The first step to obtain the layer is to build the quantum circuit (or the quantum node). To accomplish this task, we used the Python library PennyLane.

PennyLane is an open source library that integrates quantum computing with ML. It allows you to create and train quantum-classical hybrid models, where quantum circuits act as layers within classical neural networks, and it works with classical ML frameworks such as PyTorch, TensorFlow, and Keras.

The design of the circuit is the most important part of the overall solution. The predictive power of the model depends entirely on how the circuit is built.

Qubits, the fundamental units of information in quantum computing, are entities that behave quite differently from classical bits. Unlike classical bits that can only represent 0 or 1, qubits can exist in a superposition of both states simultaneously, enabling quantum parallelism and faster calculations for certain problems.

We decided to use only three qubits, a small number but sufficient for our case.

We instantiate the qubits as follows:

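    import pennylane as qml

    num_wires = 3
    dev = qml.device('default.qubit', wires=num_wires)

'default.qubit' is PennyLane's built-in qubit simulator. To access qubits on a real quantum computer, you can replace the last line with the following code:

    # Requires the amazon-braket-pennylane-plugin package; IonQ devices are hosted in us-east-1
    device_arn = "arn:aws:braket:us-east-1::device/qpu/ionq/Aria-1"
    dev = qml.device('braket.aws.qubit', device_arn=device_arn, wires=num_wires)

device_arn can be the ARN of any device supported by Braket (for a list of supported devices, refer to Amazon Braket supported devices). Note that the backprop differentiation method used in the quantum node below works only on simulators; on hardware, a method such as parameter-shift is needed instead.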

We defined the quantum node as follows:

    @qml.qnode(dev, interface="tf", diff_method="backprop")
    def quantum_nn(inputs, weights):
        # Encode the first three inputs as RY rotation angles, one per qubit
        qml.RY(inputs[0], wires=0)
        qml.RY(inputs[1], wires=1)
        qml.RY(inputs[2], wires=2)
        # Trainable rotations on qubits 1 and 2, scaled by the remaining six inputs
        qml.Rot(weights[0] * inputs[3], weights[1] * inputs[4], weights[2] * inputs[5], wires=1)
        qml.Rot(weights[3] * inputs[6], weights[4] * inputs[7], weights[5] * inputs[8], wires=2)
        qml.CNOT(wires=[1, 2])
        qml.RY(weights[6], wires=2)
        qml.CNOT(wires=[0, 2])
        qml.CNOT(wires=[1, 2])
        # Measure the PauliZ expectation values of qubits 0 and 2
        return [qml.expval(qml.PauliZ(0)), qml.expval(qml.PauliZ(2))]

The inputs are the values yielded as output from the previous layer of the neural network, and the weights are the actual weights of the quantum circuit.

RY and Rot are rotation gates applied to individual qubits; CNOT is a controlled bit-flip gate that allows us to entangle the qubits.

qml.expval(qml.PauliZ(0)) and qml.expval(qml.PauliZ(2)) are the measurements applied to qubit 0 and qubit 2, respectively, and these values are the neural network output.
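
To make the roles of the rotation, the CNOT gate, and the PauliZ measurement more concrete, here is a toy two-qubit state-vector simulation in plain NumPy. It is not the three-qubit circuit from this post and does not use PennyLane; the function names ry and run_toy_circuit are hypothetical. It applies RY(theta) to qubit 0, entangles the two qubits with a CNOT, and returns the PauliZ expectation value of each qubit.

```python
import numpy as np


def ry(theta):
    """Matrix of a single-qubit RY rotation by angle theta."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])


I2 = np.eye(2)
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
# CNOT with qubit 0 as control and qubit 1 as target; basis order |q0 q1>
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)


def run_toy_circuit(theta):
    """Apply RY(theta) on qubit 0, then CNOT(0 -> 1); return the state, <Z0>, <Z1>."""
    state = np.zeros(4)
    state[0] = 1.0                          # start in |00>
    state = np.kron(ry(theta), I2) @ state  # rotate qubit 0 only
    state = CNOT @ state                    # entangle qubit 1 with qubit 0
    z0 = float(state @ np.kron(Z, I2) @ state)
    z1 = float(state @ np.kron(I2, Z) @ state)
    return state, z0, z1


for theta in (0.0, np.pi / 2, np.pi):
    _, z0, z1 = run_toy_circuit(theta)
    print(f"theta={theta:.3f}  <Z0>={z0:+.3f}  <Z1>={z1:+.3f}")
```

In this toy circuit both expectation values equal cos(theta), so they sweep the full range from +1 to -1 as the rotation angle goes from 0 to pi.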

Diagrammatically, the circuit can be displayed as:

    0: ──RY(1.00)──────────────────────────────────────╭●────┤  <Z>
    1: ──RY(2.00)──Rot(4.00,10.00,18.00)──╭●───────────│──╭●─┤
    2: ──RY(3.00)──Rot(28.00,40.00,54.00)─╰X──RY(7.00)─╰X─╰X─┤  <Z>
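
The angles in the diagram are consistent with drawing the circuit for inputs 1 to 9 and weights 1 to 7. This is an inference from the numbers, not something the post states. The short check below recomputes the gate angles from those assumed values, following the products used in the quantum node.

```python
# Assumed example values, inferred from the diagram (not stated in the post)
inputs = [float(v) for v in range(1, 10)]   # inputs[0] .. inputs[8]
weights = [float(v) for v in range(1, 8)]   # weights[0] .. weights[6]

ry_angles = tuple(inputs[:3])                                       # RY on wires 0, 1, 2
rot_wire_1 = tuple(weights[i] * inputs[3 + i] for i in range(3))     # Rot on wire 1
rot_wire_2 = tuple(weights[3 + i] * inputs[6 + i] for i in range(3)) # Rot on wire 2
last_ry = weights[6]                                                 # final RY on wire 2

print(ry_angles, rot_wire_1, rot_wire_2, last_ry)
```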

The transformations applied to qubit 0 are fewer than the transformations applied to qubit 2. We made this choice to separate the states of the qubits so that the two measurements yield different values: applying different transformations to the qubits drives them into distinct states, which leads to different outcomes when they are measured. This phenomenon stems from the principles of superposition and entanglement inherent in quantum mechanics.

After we define the quantum circuit, we define the quantum hybrid neural network:

    def hybrid_model(num_layers, num_wires):
        # The circuit uses seven trainable weights: weights[0] to weights[6]
        weight_shapes = {"weights": (7,)}
        qlayer = qml.qnn.KerasLayer(quantum_nn, weight_shapes, output_dim=2)
        model = tf.keras.Sequential([hidden, do, out_2, do, qlayer])
        return model

KerasLayer is the PennyLane class that turns the quantum circuit into a Keras layer.

Model training

After we have preprocessed the data and defined the model, it’s time to train the network.

A preliminary step is needed in order to deal with the unbalanced dataset. We define a weight for each class according to the inverse root rule:

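    import numpy as np

    # Assumption: y_train_list holds the integer labels (0 or 1) recovered from the one-hot y_train
    y_train_list = np.argmax(y_train, axis=1)
    class_counts = np.bincount(y_train_list)
    class_frequencies = class_counts / float(len(y_train))
    class_weights = 1 / np.sqrt(class_frequencies)

The weights are given by the inverse of the square root of the relative frequency of each of the two possible target values.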
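
As a worked example of the inverse root rule, the sketch below applies it to the class counts of the full dataset reported earlier (492 frauds out of 284,807 transactions). The post computes the weights on the training split, so the actual values used in training may differ slightly; the dict conversion at the end reflects the format that the Keras class_weight argument expects.

```python
import numpy as np

# Counts for the full dataset as reported in the post
n_total = 284_807
n_fraud = 492
class_counts = np.array([n_total - n_fraud, n_fraud])

class_frequencies = class_counts / class_counts.sum()
class_weights = 1.0 / np.sqrt(class_frequencies)

# Keras' class_weight argument takes a dict that maps class index to weight
class_weight_dict = {i: float(w) for i, w in enumerate(class_weights)}
print(class_weight_dict)
```

The non-fraud class gets a weight very close to 1, while the fraud class gets a weight of roughly 24, so errors on fraudulent transactions count about 24 times more in the loss.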

We compile the model next:

    model.compile(optimizer='adam', loss='MSE', metrics=[custom_metric])

custom_metric is a modified version of the precision metric: a custom subroutine that postprocesses the output of the quantum layer into a form compatible with the optimizer.

For evaluating model performance on imbalanced data, precision is a more reliable metric than accuracy, so we optimize for precision. Precision measures the proportion of fraud alerts that are true positives, so a high value means that few legitimate transactions are incorrectly flagged (false positives). Incorrectly predicting a fraudulent transaction as valid (a false negative) can also have serious financial consequences and risks; that type of error is measured by recall rather than precision.

Finally, we fit the model:

    history = model.fit(x_train, y_train, epochs=30, batch_size=200,
                        validation_data=(x_validation, y_validation),
                        # Keras expects a dict that maps each class index to its weight
                        class_weight=dict(enumerate(class_weights)), shuffle=True)

At each epoch, the weights of both the classical and quantum layers are updated in order to reach higher accuracy. At the end of training, the network showed a loss of 0.0353 on the training set and 0.0119 on the validation set. When the fit is complete, the trained model is saved in .h5 format.

Model results and analysis

Evaluating the model is vital to gauge its capabilities and limitations, providing insights into the predictive quality and value derived from the quantum techniques.

To test the model, we make predictions on the test set:

    preds = model.predict(x_test)

Because the neural network is trained as a regression model, it yields for each record of x_test a two-element array in which each component can assume values between 0 and 1. Because we're essentially dealing with a binary classification problem, the ideal outputs are [1, 0] for a non-fraudulent transaction and [0, 1] for a fraudulent one, matching the encoding of the targets.

To convert the continuous values into binary classification, a threshold is necessary. Predictions that are equal to or above the threshold are assigned 1, and those below the threshold are assigned 0.

To align with our goal of optimizing precision, we chose the threshold value that results in the highest precision.
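
The following sketch shows how thresholding turns continuous scores into class labels and how precision and recall for the fraud class change with the threshold. The scores and labels are made up for illustration, and treating one output component as the fraud score is an assumption; the helper precision_recall is hypothetical and is not the custom_metric used in this post.

```python
import numpy as np


def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN) for the positive class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return float(precision), float(recall)


# Hypothetical fraud scores and true labels (1 = fraud)
scores = np.array([0.10, 0.72, 0.95, 0.66, 0.30, 0.80, 0.68, 0.05])
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 0])

results = {}
for threshold in (0.65, 0.70, 0.75):
    # Scores at or above the threshold are labeled as fraud
    y_pred = (scores >= threshold).astype(int)
    results[threshold] = precision_recall(y_true, y_pred)
    p, r = results[threshold]
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")
```

On this toy data, raising the threshold removes false alarms and increases precision, at the cost of missing one fraudulent transaction and lowering recall.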

The following table summarizes the mapping between various threshold values and the precision.

| Class | Threshold = 0.65 | Threshold = 0.70 | Threshold = 0.75 |
|---|---|---|---|
| No Fraud | 1.00 | 1.00 | 1.00 |
| Fraud | 0.87 | 0.89 | 0.92 |

The model demonstrates almost flawless performance on the predominant non-fraud class, with precision and recall scores close to a perfect 1. Despite far less data, the model achieves a precision of 0.87 for detecting the minority fraud class at a 0.65 threshold, which rises to 0.92 at a 0.75 threshold, underscoring performance even on sparse data. To efficiently identify fraud while minimizing incorrect fraud reports, we decided to prioritize precision over recall.

We also wanted to compare this model with a purely classical neural network to see whether the quantum layer actually delivers gains. We built and trained an identical model in which the quantum layer is replaced by the following:

    Dense(2, activation="softmax")

In the last epoch, the loss was 0.0119 and the validation loss was 0.0051.

The following table summarizes the mapping between various threshold values and the precision for the classic neural network model.

| Class | Threshold = 0.65 | Threshold = 0.70 | Threshold = 0.75 |
|---|---|---|---|
| No Fraud | 1.00 | 1.00 | 1.00 |
| Fraud | 0.83 | 0.84 | 0.86 |

Like the quantum hybrid model, the model performance is almost perfect for the majority class and very good for the minority class.

The hybrid neural network has 1,296 parameters, whereas the classic one has 1,329. When comparing precision values, we can observe how the quantum solution provides better results. The hybrid model, inheriting the properties of high-dimensional spaces exploration and a non-linearity from the quantum layer, is able to generalize the problem better using fewer parameters, resulting in better performance.
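
The parameter count of the hybrid model can be reproduced with simple arithmetic. The sketch below assumes that all 30 non-target columns of the dataset (Time, the 28 PCA components V1 to V28, and Amount) are used as input features; it covers only the hybrid model, and the helper dense_params is hypothetical.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


n_features = 30                                  # assumption: Time, V1..V28, Amount
hidden_params = dense_params(n_features, 32)     # first dense layer
out_2_params = dense_params(32, 9)               # second dense layer
quantum_params = 7                               # trainable weights of the quantum circuit

hybrid_total = hidden_params + out_2_params + quantum_params
print(hidden_params, out_2_params, quantum_params, hybrid_total)
```

Under this assumption, the total matches the 1,296 parameters reported for the hybrid network.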

Challenges of a quantum solution

Although the adoption of quantum technology shows promise in providing organizations numerous benefits, practical implementation on large-scale, fault-tolerant quantum computers is a complex task and is an active area of research. Therefore, we should be mindful of the challenges that it poses.

Conclusion

The results discussed in this post suggest that quantum computing holds substantial promise for fraud detection in the financial services industry. The hybrid quantum neural network demonstrated superior performance in accurately identifying fraudulent transactions, highlighting the potential gains offered by quantum technology. As quantum computing continues to advance, its role in revolutionizing fraud detection and other critical financial processes will become increasingly evident. You can extend the results of the simulation by using real qubits and testing various outcomes on real hardware available on Braket, such as those from IQM, IonQ, and Rigetti, all on demand, with pay-as-you-go pricing and no upfront commitments.

To prepare for the future of quantum computing, organizations must stay informed on the latest advancements in quantum technology. Adopting quantum-ready cloud solutions now is a strategic priority, allowing a smooth transition to quantum when hardware reaches commercial viability. This forward-thinking approach will provide both a technological edge and rapid adaptation to quantum computing’s transformative potential across industries. With an integrated cloud strategy, businesses can proactively get quantum-ready, primed to capitalize on quantum capabilities at the right moment. To accelerate your learning journey and earn a digital badge in quantum computing fundamentals, see Introducing the Amazon Braket Learning Plan and Digital Badge.

Connect with Deloitte to pilot this solution for your enterprise on AWS.


About the authors

Federica Marini is a Manager in the Deloitte Italy AI & Data practice with strong experience as a business advisor and technical expert in the field of AI, Gen AI, ML, and Data. She addresses research and customer business needs with tailored data-driven solutions providing meaningful results. She is passionate about innovation and believes digital disruption will require a human-centered approach to achieve its full potential.

Matteo Capozi is a Data and AI expert at Deloitte Italy, specializing in the design and implementation of advanced AI and GenAI models and quantum computing solutions. With a strong background in cutting-edge technologies, Matteo excels in helping organizations harness the power of AI to drive innovation and solve complex problems. His expertise spans industries, where he collaborates closely with executive stakeholders to achieve strategic goals and performance improvements.

Kasi Muthu is a senior partner solutions architect focusing on generative AI and data at AWS based out of Dallas, TX. He is passionate about helping partners and customers accelerate their cloud journey. He is a trusted advisor in this field and has plenty of experience architecting and building scalable, resilient, and performant workloads in the cloud. Outside of work, he enjoys spending time with his family.

Kuldeep Singh is a Principal Global AI/ML leader at AWS with over 20 years in tech. He skillfully combines his sales and entrepreneurship expertise with a deep understanding of AI, ML, and cybersecurity. He excels in forging strategic global partnerships, driving transformative solutions and strategies across various industries with a focus on generative AI and GSIs.
