Articles related to "language model jailbreaks"
Anthropic has a new way to protect large language models against jailbreaks
MIT Technology Review » Artificial Intelligence
2025-02-03T16:40:34.000000Z
Avoiding jailbreaks by discouraging their representation in activation space
LessWrong
2024-09-28T02:22:44.000000Z