Neural Machine Unranking

arXiv:2408.05330v3

Abstract: We address the problem of machine unlearning in neural information retrieval (IR), introducing a novel task termed Neural Machine UnRanking (NuMuR). This problem is motivated by growing demands for data privacy compliance and selective information removal in neural IR systems. Existing task- or model-agnostic unlearning approaches, primarily designed for classification tasks, are suboptimal for NuMuR due to two core challenges: (1) neural rankers output unnormalised relevance scores rather than probability distributions, limiting the effectiveness of traditional teacher-student distillation frameworks; and (2) entangled data scenarios, where queries and documents appear simultaneously across both forget and retain sets, may degrade retention performance in existing methods. To address these issues, we propose Contrastive and Consistent Loss (CoCoL), a dual-objective framework. CoCoL comprises (1) a contrastive loss that reduces relevance scores on forget sets while maintaining performance on entangled samples, and (2) a consistent loss that preserves accuracy on the retain set. Extensive experiments on the MS MARCO and TREC CAR datasets, across four neural IR models, demonstrate that CoCoL achieves substantial forgetting with minimal loss in retain and generalisation performance. Our method facilitates more effective and controllable data removal than existing techniques.
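The dual-objective idea can be illustrated with a minimal sketch that operates directly on unnormalised relevance scores. This is not the paper's formulation: the abstract does not give the loss definitions, so the hinge-style contrastive term, the mean-squared-error consistency term, and the names `contrastive_forget_loss`, `consistency_loss`, `combined_loss`, `margin`, and `alpha` are all assumptions chosen for illustration.

```python
from typing import Sequence


def contrastive_forget_loss(
    forget_scores: Sequence[float],
    entangled_scores: Sequence[float],
    margin: float = 1.0,
) -> float:
    """Hypothetical hinge term: penalise a forget-set score unless it sits
    at least `margin` below the paired entangled retain-set score."""
    losses = [
        max(0.0, margin + f - e)
        for f, e in zip(forget_scores, entangled_scores)
    ]
    return sum(losses) / len(losses)


def consistency_loss(
    unlearned_scores: Sequence[float],
    original_scores: Sequence[float],
) -> float:
    """Hypothetical consistency term: mean squared error between the
    unlearned model's raw scores and the original model's raw scores
    on retain-set pairs. No softmax is applied, so the scores need not
    be probabilities."""
    errors = [
        (u - o) ** 2 for u, o in zip(unlearned_scores, original_scores)
    ]
    return sum(errors) / len(errors)


def combined_loss(
    forget_scores: Sequence[float],
    entangled_scores: Sequence[float],
    unlearned_retain_scores: Sequence[float],
    original_retain_scores: Sequence[float],
    margin: float = 1.0,
    alpha: float = 0.5,  # assumed weighting between the two objectives
) -> float:
    """Weighted sum of the two illustrative objectives."""
    forget_term = contrastive_forget_loss(forget_scores, entangled_scores, margin)
    retain_term = consistency_loss(unlearned_retain_scores, original_retain_scores)
    return alpha * forget_term + (1.0 - alpha) * retain_term
```

In this sketch the forget term reaches zero once every forget-set score is at least `margin` below its entangled counterpart, while the retain term keeps the unlearned model's scores close to those of the original model.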
