cs.AI updates on arXiv.org, 20 hours ago
How to Protect Models against Adversarial Unlearning?

This article examines the problem of unlearning in AI models, analyzes how a malicious party can send unlearning requests to degrade model performance as much as possible, and introduces a new method for protecting model performance.

arXiv:2507.10886v1 Announce Type: cross Abstract: AI models need to be unlearned to fulfill the requirements of legal acts such as the AI Act or GDPR, and also to remove toxic content, debias the model, undo the impact of malicious instances, or adapt to changes in the data distribution in which a model operates. Unfortunately, removing knowledge may cause undesirable side effects, such as a deterioration in model performance. In this paper, we investigate the problem of adversarial unlearning, where a malicious party intentionally sends unlearn requests to degrade the model's performance as much as possible. We show that this phenomenon and the adversary's capabilities depend on many factors, primarily on the backbone model itself and on the strategy and limitations in selecting the data to be unlearned. The main result of this work is a new method of protecting model performance from these side effects, whether the unlearning requests stem from spontaneous processes or from adversarial actions.
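To make the threat model concrete, here is a minimal sketch (not the paper's method or its protection scheme) of how adversarial unlearn requests can hurt more than random ones. It assumes a scikit-learn logistic regression, simulates unlearning by exact retraining without the removed points, and uses a purely illustrative adversarial heuristic (requesting removal of correctly classified points nearest the decision boundary); the function names and the budget are hypothetical.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def unlearn_by_retraining(X, y, drop_idx):
    """Simulate exact unlearning: retrain from scratch without the removed points."""
    keep = np.setdiff1d(np.arange(len(y)), drop_idx)
    return LogisticRegression(max_iter=1000).fit(X[keep], y[keep])

X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

base = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("baseline test accuracy:", accuracy_score(y_te, base.predict(X_te)))

budget = 200  # number of unlearn requests the requester may send (assumed)

# Benign case: unlearn requests arrive for randomly chosen training points.
rng = np.random.default_rng(0)
random_idx = rng.choice(len(y_tr), size=budget, replace=False)
random_model = unlearn_by_retraining(X_tr, y_tr, random_idx)
print("after random unlearning:", accuracy_score(y_te, random_model.predict(X_te)))

# Adversarial case (illustrative heuristic, not the paper's attack): request
# unlearning of correctly classified points closest to the decision boundary,
# i.e. the points that most strongly pin down where the boundary lies.
margin = np.abs(base.decision_function(X_tr))
correct = base.predict(X_tr) == y_tr
order = np.argsort(np.where(correct, margin, np.inf))
adversarial_idx = order[:budget]
adversarial_model = unlearn_by_retraining(X_tr, y_tr, adversarial_idx)
print("after adversarial unlearning:",
      accuracy_score(y_te, adversarial_model.predict(X_te)))

Comparing the two accuracy drops for the same unlearning budget is one way to gauge how much the adversary's freedom in selecting data to be unlearned matters, which is the dependency the abstract highlights.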

Related tags

AI models, unlearning, model performance