Trending
Articles related to "malicious fine-tuning"
Estimating Worst-Case Frontier Risks of Open-Weight LLMs
cs.AI updates on arXiv.org 2025-08-06T04:38:39.000000Z
SDD: Self-Degraded Defense against Malicious Fine-tuning
cs.AI updates on arXiv.org 2025-07-30T04:46:09.000000Z
GIFT: Gradient-aware Immunization of diffusion models against malicious Fine-Tuning with safe concepts retention
cs.AI updates on arXiv.org 2025-07-21T04:06:46.000000Z