Articles related to "defense measures"
Anthropic has a new way to protect large language models against jailbreaks
MIT Technology Review » Artificial Intelligence 2025-02-03T16:40:34.000000Z
Model alignment protects against accidental harms, not intentional ones
AI Snake Oil 2024-12-13T05:08:43.000000Z