Planted in Pretraining, Swayed by Finetuning: A Case Study on the Origins of Cognitive Biases in LLMs

cs.AI updates on arXiv.org, July 11, 12:04

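This paper experimentally investigates where cognitive biases in LLMs come from, finds that they are shaped mainly by pretraining, and proposes a cross-tuning method to isolate the sources of bias.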

arXiv:2507.07186v1 Announce Type: cross

Abstract: Large language models (LLMs) exhibit cognitive biases: systematic tendencies of irrational decision-making, similar to those seen in humans. Prior work has found that these biases vary across models and can be amplified by instruction tuning. However, it remains unclear if these differences in biases stem from pretraining, finetuning, or even random noise due to training stochasticity. We propose a two-step causal experimental approach to disentangle these factors. First, we finetune models multiple times using different random seeds to study how training randomness affects over 30 cognitive biases. Second, we introduce cross-tuning: swapping instruction datasets between models to isolate bias sources. This swap uses datasets that led to different bias patterns, directly testing whether biases are dataset-dependent. Our findings reveal that while training randomness introduces some variability, biases are mainly shaped by pretraining: models with the same pretrained backbone exhibit more similar bias patterns than those sharing only finetuning data. These insights suggest that understanding biases in finetuned models requires considering their pretraining origins beyond finetuning effects. This perspective can guide future efforts to develop principled strategies for evaluating and mitigating bias in LLMs.
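The cross-tuning comparison described in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the backbone and dataset names are hypothetical, the bias scores are made-up toy numbers rather than results from the paper, and cosine similarity stands in for whatever similarity measure the authors actually use, which the abstract does not specify. The sketch takes a 2x2 design (two backbones, each finetuned on two instruction datasets) and asks whether models sharing a backbone have more similar bias vectors than models sharing only a finetuning dataset.

```python
from itertools import combinations
from math import sqrt

# Toy data only: (backbone, instruction dataset, per-bias scores).
# Names and numbers are hypothetical, not taken from the paper.
models = [
    ("backbone_A", "dataset_X", [0.8, 0.1, -0.3]),
    ("backbone_A", "dataset_Y", [0.7, 0.2, -0.2]),
    ("backbone_B", "dataset_X", [-0.2, 0.6, 0.5]),
    ("backbone_B", "dataset_Y", [-0.1, 0.7, 0.4]),
]


def cosine(u, v):
    """Cosine similarity between two equal-length score vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_similarity(models, shared_index):
    """Mean cosine similarity over pairs that share one factor only.

    shared_index = 0 compares models with the same backbone but
    different datasets; shared_index = 1 compares models with the
    same dataset but different backbones.
    """
    other_index = 1 - shared_index
    sims = [
        cosine(a[2], b[2])
        for a, b in combinations(models, 2)
        if a[shared_index] == b[shared_index]
        and a[other_index] != b[other_index]
    ]
    return sum(sims) / len(sims)


same_backbone = mean_similarity(models, 0)
same_dataset = mean_similarity(models, 1)
print(f"same backbone, different dataset: {same_backbone:.3f}")
print(f"same dataset, different backbone: {same_dataset:.3f}")
```

With these toy numbers the same-backbone pairs come out more similar, which is the pattern the abstract reports. A fuller version of the first step would repeat each cell over several random seeds and check that the gap between the two averages exceeds the seed-to-seed spread.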
