Articles related to "Refusal Behavior"
Opening the AI Black Box: How Can Attribution Graphs Map the Brain Circuits of Large Language Models?
BAAI Community (智源社区)
2025-06-04T02:03:07.000000Z
Latent Adversarial Training (LAT) Improves the Representation of Refusal
LessWrong
2025-01-06T13:34:27.000000Z
45 Shades of AI Safety: SORRY-Bench’s Innovative Taxonomy for LLM Refusal Behavior Analysis
MarkTechPost@AI
2024-07-02T21:16:44.000000Z