MarkTechPost@AI September 25, 2024
AdvDGMs: Enhancing Adversarial Robustness in Tabular Machine Learning by Incorporating Constraint Repair Layers for Realistic and Domain-Specific Attack Generation

This article explores adversarial machine learning for tabular data, introduces a new method that generates adversarial examples conforming to real-world relationships by adding a constraint repair layer, and evaluates its effectiveness.

🎯 Adversarial machine learning focuses on testing and strengthening the resilience of machine learning systems through adversarial examples. Tabular data has a complex structure, and generating adversarial examples for it poses challenges such as accounting for relationships between variables and specific constraints.

💡 Earlier models had limitations when used for adversarial generation; more recent models generate adversarial examples by adding noise or manipulating features, but they suffer from problems such as a restricted search space.

🌟 The researchers converted existing DGMs into AdvDGMs and added a constraint repair layer, so that the generated adversarial data both changes model predictions and conforms to the logical rules and relationships within the dataset.

📊 The researchers tested the models on multiple real-world datasets and compared the attack success rates of different models; the results show that constrained AdvDGMs perform better in many cases.

Adversarial machine learning is a growing field that focuses on testing and enhancing the resilience of machine learning (ML) systems through adversarial examples. These examples are crafted by subtly altering data to deceive models into making incorrect predictions. Deep generative models (DGMs) have shown significant promise in generating such adversarial examples, especially in computer vision, where perturbed images are routinely used to probe model robustness. Extending the technique to other types of data, particularly tabular data, introduces additional challenges because the generated examples must maintain realistic relationships between features. In domains like finance or healthcare, for instance, adversarial examples must conform to domain constraints that have no straightforward counterpart in images or text.

One of the most prominent challenges in applying adversarial techniques to tabular data stems from the complexity of its structure. Tabular data is often more intricate than other forms of data because it encodes numerous relationships between variables. These variables may represent different data types, such as categorical, numerical, or binary, and are subject to specific constraints. For example, in a financial dataset, a model might need to ensure that an “average transaction amount” does not exceed a “maximum transaction amount.” Failing to respect such constraints yields unrealistic adversarial examples that cannot be used to assess the security of ML models objectively. Existing models for generating adversarial examples in tabular data have frequently struggled with this issue, in some cases producing outputs that are 100% unrealistic.
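To make this concrete, here is a minimal sketch of such a validity check for a hypothetical financial record; the feature names (avg_transaction, max_transaction) and rules are illustrative assumptions, not constraints taken from the paper:

```python
# Minimal sketch of a domain-constraint check for tabular data.
# Feature names and rules are hypothetical illustrations.

def is_realistic(row: dict) -> bool:
    """Return True if the record satisfies basic domain constraints."""
    # An average can never exceed the corresponding maximum.
    if row["avg_transaction"] > row["max_transaction"]:
        return False
    # Monetary amounts must be non-negative.
    if min(row["avg_transaction"], row["max_transaction"]) < 0:
        return False
    return True

# An example that flips a model's prediction but fails this check is
# useless for assessing real-world robustness.
print(is_realistic({"avg_transaction": 120.0, "max_transaction": 80.0}))  # False
```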

Various methods have been employed to generate adversarial examples for tabular data. Early models like TableGAN, CTGAN, and TVAE were originally designed to create synthetic tabular datasets for augmentation and privacy-preserving data generation. However, these models have limitations when used for adversarial generation because they do not account for the unique domain-specific constraints that are crucial for ensuring realism in adversarial examples. Recent models have attempted to address this by adding noise to the data or manipulating individual features, but this approach limits the search space for adversarial examples, making them less effective in real-world applications.
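For contrast, the perturbation-style strategy mentioned above can be sketched generically as follows: each sample is only nudged within a small noise ball around its original values, which is exactly what confines the attack's search space (a generic illustration, not any specific model's code):

```python
import numpy as np

# Generic sketch of noise-based perturbation: each sample can only move
# within a small epsilon-ball, so the attack explores a narrow region
# of the feature space around the original data point.
rng = np.random.default_rng(seed=0)

def perturb(x: np.ndarray, epsilon: float = 0.05) -> np.ndarray:
    return x + rng.uniform(-epsilon, epsilon, size=x.shape)

print(perturb(np.array([0.3, 0.7, 1.2])))
```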

Researchers from the University of Luxembourg, Oxford University, and Imperial College London introduced a new approach by converting existing DGMs into adversarial DGMs (AdvDGMs) and improving them by adding a constraint repair layer. They aimed to adapt models such as WGAN, TableGAN, CTGAN, and TVAE into versions that could generate adversarial examples while ensuring they conform to the necessary domain constraints. These enhanced models, referred to as constrained adversarial DGMs (C-AdvDGMs), allow researchers to generate adversarial data that not only changes the ML model’s predictions but also adheres to logical rules and relationships within the dataset.
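The paper's exact training objectives are not reproduced here, but the general recipe behind an AdvDGM can be sketched as a generator loss augmented with an adversarial term that rewards flipping a frozen target classifier's predictions. Everything below (generator, target_clf, gen_loss_fn, adv_weight) is an illustrative stand-in, not the authors' implementation:

```python
import torch.nn.functional as F

# Hedged sketch of the generic AdvDGM idea: combine the generator's
# usual loss with a term that pushes a frozen victim classifier toward
# misclassifying the generated samples.
def adv_generator_loss(generator, target_clf, z, y_true, gen_loss_fn, adv_weight=1.0):
    x_fake = generator(z)            # candidate adversarial tabular samples
    base_loss = gen_loss_fn(x_fake)  # ordinary GAN/VAE generator loss
    logits = target_clf(x_fake)      # frozen victim model
    # Maximizing the victim's loss on the true labels encourages its
    # predictions to flip away from them.
    adv_loss = -F.cross_entropy(logits, y_true)
    return base_loss + adv_weight * adv_loss
```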

The core advancement of this work lies in the constraint repair layer. This layer checks each generated adversarial example against predefined constraints specific to the dataset. If an example violates a rule, such as one variable exceeding its logical maximum, the repair layer modifies it so that all domain-specific requirements are satisfied. The repair can be integrated into the model's training or applied post-generation, making the method versatile. Adding the constraint layer does not significantly slow the model down, incurring only a minor increase in computation time, on the order of 0.12 seconds in some cases.
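As an illustration of the idea, the sketch below implements a repair layer for a single rule of the form x[:, i] ≤ x[:, j] (e.g. “average ≤ maximum”) as an elementwise projection. The paper's actual layer handles general domain constraints; this is a toy version under that single-rule assumption:

```python
import torch
import torch.nn as nn

# Toy constraint repair layer enforcing x[:, i] <= x[:, j]. Violating
# values are projected onto the feasible set with an elementwise min,
# which is differentiable almost everywhere, so the layer can sit inside
# the generator during training or be applied after generation.
class LessEqualRepair(nn.Module):
    def __init__(self, i: int, j: int):
        super().__init__()
        self.i, self.j = i, j  # enforce x[:, i] <= x[:, j]

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        repaired = x.clone()
        repaired[:, self.i] = torch.minimum(x[:, self.i], x[:, self.j])
        return repaired

layer = LessEqualRepair(i=0, j=1)
x = torch.tensor([[120.0, 80.0], [30.0, 80.0]])  # first row violates the rule
print(layer(x))  # tensor([[80., 80.], [30., 80.]])
```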

In evaluating the effectiveness of their proposed models, the researchers tested them on several real-world datasets, including URL, WiDS, Heloc, and FSP. They compared the performance of unconstrained AdvDGMs with their constrained counterparts, C-AdvDGMs, across three popular ML models: TorchRLN, VIME, and TabTransformer. The key metric was the Attack Success Rate (ASR). For example, the AdvWGAN model combined with the constraint layer achieved an impressive ASR of 95% on the Heloc dataset when tested against the TabTransformer model, a significant improvement over previous attempts to generate adversarial tabular data. In 38 out of 48 test cases, the P-AdvDGMs (models with constraints applied during sample generation) showed a higher ASR than their unconstrained versions, with the best-performing model increasing the ASR by 62%.
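For reference, the Attack Success Rate is typically computed as the fraction of adversarial examples that move the victim model's prediction away from the correct label. A minimal sketch follows; the paper may count successes slightly differently (e.g. only over samples the model originally classified correctly):

```python
import numpy as np

# ASR: fraction of adversarial examples whose prediction differs from
# the true label after the attack.
def attack_success_rate(pred_adv: np.ndarray, y_true: np.ndarray) -> float:
    return float(np.mean(pred_adv != y_true))

print(attack_success_rate(np.array([1, 0, 1, 1]), np.array([0, 0, 0, 0])))  # 0.75
```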

The researchers also tested their models against other state-of-the-art (SOTA) adversarial attack methods, including the gradient-based attacks CPGD and CAPGD and a genetic-algorithm attack called MOEVA. The constrained AdvDGMs demonstrated superior performance in many cases, particularly in generating more realistic adversarial examples, which made them more effective at deceiving the target ML models. For instance, MOEVA outperformed the gradient-based attacks in nine out of twelve datasets, yet AdvWGAN and its variants still ranked as the second-best performing method on datasets like Heloc and FSP.

In conclusion, this research addresses a crucial gap in adversarial machine learning for tabular data. By introducing a constraint repair layer, the researchers successfully adapted DGMs to generate adversarial examples that deceive ML models and maintain necessary real-world relationships between features. The success of the AdvWGAN model, which achieved a 95% ASR on the Heloc dataset, indicates the potential of this method for improving the robustness of ML models in domains requiring highly structured and realistic adversarial data. This work paves the way for more reliable security assessments in ML systems and demonstrates the importance of constraint adherence in generating adversarial examples.


Check out the Paper. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter and join our Telegram Channel and LinkedIn Group. If you like our work, you will love our newsletter.

