cs.AI updates on arXiv.org, 20 hours ago
How to Protect Models against Adversarial Unlearning?

This article examines the problem of unlearning in AI models, analyzes how a malicious party can send unlearning requests to degrade model performance as much as possible, and introduces a new method for protecting model performance.

arXiv:2507.10886v1 Announce Type: cross Abstract: AI models need to be unlearned to fulfill the requirements of legal acts such as the AI Act or GDPR, and also to remove toxic content, debias the model, undo the impact of malicious instances, or adapt to changes in the data distribution in which a model operates. Unfortunately, removing knowledge may cause undesirable side effects, such as a deterioration in model performance. In this paper, we investigate the problem of adversarial unlearning, where a malicious party intentionally sends unlearn requests to degrade the model's performance as much as possible. We show that this phenomenon and the adversary's capabilities depend on many factors, primarily on the backbone model itself and on the strategy and limitations in selecting the data to be unlearned. The main result of this work is a new method of protecting model performance from these side effects, whether the unlearning requests stem from spontaneous processes or from adversarial actions.
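To make the threat model concrete, here is a minimal sketch (not the paper's method or its protection scheme) of how adversarial unlearn requests can hurt more than random ones. It assumes a scikit-learn logistic regression, simulates unlearning by exact retraining without the removed points, and uses a purely illustrative adversarial heuristic (requesting removal of correctly classified points nearest the decision boundary); the function names and the budget are hypothetical.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def unlearn_by_retraining(X, y, drop_idx):
    """Simulate exact unlearning: retrain from scratch without the removed points."""
    keep = np.setdiff1d(np.arange(len(y)), drop_idx)
    return LogisticRegression(max_iter=1000).fit(X[keep], y[keep])

X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

base = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("baseline test accuracy:", accuracy_score(y_te, base.predict(X_te)))

budget = 200  # number of unlearn requests the requester may send (assumed)

# Benign case: unlearn requests arrive for randomly chosen training points.
rng = np.random.default_rng(0)
random_idx = rng.choice(len(y_tr), size=budget, replace=False)
random_model = unlearn_by_retraining(X_tr, y_tr, random_idx)
print("after random unlearning:", accuracy_score(y_te, random_model.predict(X_te)))

# Adversarial case (illustrative heuristic, not the paper's attack): request
# unlearning of correctly classified points closest to the decision boundary,
# i.e. the points that most strongly pin down where the boundary lies.
margin = np.abs(base.decision_function(X_tr))
correct = base.predict(X_tr) == y_tr
order = np.argsort(np.where(correct, margin, np.inf))
adversarial_idx = order[:budget]
adversarial_model = unlearn_by_retraining(X_tr, y_tr, adversarial_idx)
print("after adversarial unlearning:",
      accuracy_score(y_te, adversarial_model.predict(X_te)))

Comparing the two accuracy drops for the same unlearning budget is one way to gauge how much the adversary's freedom in selecting data to be unlearned matters, which is the dependency the abstract highlights.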

Related tags

AI models, unlearning, model performance