cs.AI updates on arXiv.org 07月30日 12:12
Analysis of Threat-Based Manipulation in Large Language Models: A Dual Perspective on Vulnerabilities and Performance Enhancement Opportunities
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

本文分析了大型语言模型(LLMs)在威胁性操纵下的复杂反应,揭示其脆弱性和性能提升的潜在机会。通过引入新型威胁分类法和多指标评估框架,对3,390个实验响应进行量化分析,发现系统性的脆弱性和角色威胁下的高指标显著性,以及大量案例中的显著性能提升。

arXiv:2507.21133v1 Announce Type: cross Abstract: Large Language Models (LLMs) demonstrate complex responses to threat-based manipulations, revealing both vulnerabilities and unexpected performance enhancement opportunities. This study presents a comprehensive analysis of 3,390 experimental responses from three major LLMs (Claude, GPT-4, Gemini) across 10 task domains under 6 threat conditions. We introduce a novel threat taxonomy and multi-metric evaluation framework to quantify both negative manipulation effects and positive performance improvements. Results reveal systematic vulnerabilities, with policy evaluation showing the highest metric significance rates under role-based threats, alongside substantial performance enhancements in numerous cases with effect sizes up to +1336%. Statistical analysis indicates systematic certainty manipulation (pFDR

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

LLMs 威胁性操纵 性能提升 脆弱性 多指标评估
相关文章