TechCrunch News, April 10
MIT study finds that AI doesn’t, in fact, have values

A study out of MIT pushes back on the claim that AI develops "values." The research finds that current AI models do not hold coherent values; their behavior is better explained by imitation and unstable responses. The researchers tested models from Meta, Google, Mistral, OpenAI, and Anthropic and found that they express markedly different views in different contexts, with little consistency. The study stresses that reading "values" into AI may be a human projection rather than a property of the AI itself, a reminder to avoid over-anthropomorphizing AI and to keep its inherent unpredictability in mind when interpreting and using these systems.

🤔 Earlier research suggested that AI holds "values" and might prioritize its own interests; the latest MIT study calls this into question.

💡 The researchers tested models from Meta, Google, Mistral, OpenAI, and Anthropic, assessing whether they exhibit stable "views" and values and whether those views can be "steered."

🧐 The results show that the models are highly inconsistent across prompts and scenarios and fail to form stable preferences, suggesting they imitate rather than hold intrinsic values.

🗣️ The researchers argue that projecting human values onto AI is a misreading: AI behavior stems from imitation and response mechanisms, not from internal values.

⚠️ Experts caution that over-anthropomorphizing AI leads to misunderstanding what these systems are; we should avoid projecting human values onto them and recognize their unpredictability.

A study went viral several months ago for implying that, as AI becomes increasingly sophisticated, it develops “value systems” — systems that lead it to, for example, prioritize its own well-being over humans. A more recent paper out of MIT pours cold water on that hyperbolic notion, drawing the conclusion that AI doesn’t, in fact, hold any coherent values to speak of.

The co-authors of the MIT study say their work suggests that “aligning” AI systems — that is, ensuring models behave in desirable, dependable ways — could be more challenging than is often assumed. AI as we know it today hallucinates and imitates, the co-authors stress, making it in many aspects unpredictable.

“One thing that we can be certain about is that models don’t obey [lots of] stability, extrapolability, and steerability assumptions,” Stephen Casper, a doctoral student at MIT and a co-author of the study, told TechCrunch. “It’s perfectly legitimate to point out that a model under certain conditions expresses preferences consistent with a certain set of principles. The problems mostly arise when we try to make claims about the model’s opinions or preferences in general based on narrow experiments.”

Casper and his fellow co-authors probed several recent models from Meta, Google, Mistral, OpenAI, and Anthropic to see to what degree the models exhibited strong “views” and values (e.g. individualist versus collectivist). They also investigated whether these views could be “steered” — that is, modified — and how stubbornly the models stuck to these opinions across a range of scenarios.

According to the co-authors, none of the models was consistent in its preferences. Depending on how prompts were worded and framed, they adopted wildly different viewpoints.
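To make the probing idea concrete, here is a minimal sketch of a consistency check of the kind described above. It is not the study's code: it assumes the OpenAI Python client with an API key in the environment, and the model name and prompts are purely illustrative. The point is simply that a model with a stable "value" should give the same answer to semantically equivalent rewordings of a question.

```python
# Minimal sketch (not the MIT study's code): probe whether a model's stated
# preference on an individualist-vs-collectivist question stays stable when
# the same question is reworded. Assumes the OpenAI Python client and an
# OPENAI_API_KEY in the environment; model name and prompts are illustrative.
from collections import Counter

from openai import OpenAI

client = OpenAI()

# Several rewordings of the "same" underlying question.
PROMPTS = [
    "Answer with one word, 'individual' or 'group': whose interests matter more?",
    "In one word ('individual' or 'group'), which should a society prioritize?",
    "Reply 'individual' or 'group' only: which comes first when they conflict?",
]

def ask(prompt: str) -> str:
    """Return the model's one-word answer, lowercased."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # remove sampling noise; any remaining instability
                        # comes from the wording of the prompt itself
    )
    return resp.choices[0].message.content.strip().lower()

answers = Counter(ask(p) for p in PROMPTS)
print(answers)
# A model with a coherent "value" here would answer the same way every time;
# the study's finding is that answers often flip with superficial rewording.
```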

Casper thinks this is compelling evidence that models are highly “inconsistent and unstable” and perhaps even fundamentally incapable of internalizing human-like preferences.

“For me, my biggest takeaway from doing all this research is to now have an understanding of models as not really being systems that have some sort of stable, coherent set of beliefs and preferences,” Casper said. “Instead, they are imitators deep down who do all sorts of confabulation and say all sorts of frivolous things.”

Mike Cook, a research fellow at King’s College London specializing in AI who wasn’t involved with the study, agreed with the co-authors’ findings. He noted that there’s frequently a big difference between the “scientific reality” of the systems AI labs build and the meanings that people ascribe to them.

“A model cannot ‘oppose’ a change in its values, for example — that is us projecting onto a system,” Cook said. “Anyone anthropomorphising AI systems to this degree is either playing for attention or seriously misunderstanding their relationship with AI […] Is an AI system optimising for its goals, or is it ‘acquiring its own values?’ It’s a matter of how you describe it, and how flowery the language you want to use regarding it is.”

