cs.AI updates on arXiv.org
Entropy-Lens: The Information Signature of Transformer Computations

This work studies how token-level distributions evolve inside Transformer models, computing the Shannon entropy of each layer's predicted distribution to obtain an interpretable entropy profile that reveals the model's computation patterns and is shown to correlate with output correctness.

arXiv:2502.16570v2 Announce Type: replace-cross

Abstract: Transformer models map input token sequences to output token distributions, layer by layer. While most interpretability work focuses on internal latent representations, we study the evolution of these token-level distributions directly in vocabulary space. However, such distributions are high-dimensional and defined on an unordered support, making common descriptors like moments or cumulants ill-suited. We address this by computing the Shannon entropy of each intermediate predicted distribution, yielding one interpretable scalar per layer. The resulting sequence, the entropy profile, serves as a compact, information-theoretic signature of the model's computation. We introduce Entropy-Lens, a model-agnostic framework that extracts entropy profiles from frozen, off-the-shelf transformers. We show that these profiles (i) reveal family-specific computation patterns invariant under depth rescaling, (ii) are predictive of prompt type and task format, and (iii) correlate with output correctness. We further show that Rényi entropies yield similar results within a broad range of $\alpha$ values, justifying the use of Shannon entropy as a stable and principled summary. Our results hold across different transformers, without requiring gradients, fine-tuning, or access to model internals.
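
The quantities the abstract relies on can be stated compactly. For each layer $\ell$, the intermediate predicted distribution $p_\ell$ over the vocabulary $V$ is summarized by its Shannon entropy, and the robustness check uses Rényi entropies, which recover the Shannon case in the limit $\alpha \to 1$:

$$
H(p_\ell) = -\sum_{v \in V} p_\ell(v) \log p_\ell(v), \qquad
H_\alpha(p_\ell) = \frac{1}{1-\alpha} \log \sum_{v \in V} p_\ell(v)^\alpha, \qquad
\lim_{\alpha \to 1} H_\alpha(p_\ell) = H(p_\ell).
$$

Below is a minimal sketch of how such an entropy profile could be extracted from a frozen, off-the-shelf model. It assumes a logit-lens-style readout (final layer norm plus the unembedding applied to every layer's hidden state); the function name `entropy_profile` and the choice of GPT-2 are illustrative, not the authors' released Entropy-Lens API.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def entropy_profile(model, tokenizer, prompt: str) -> torch.Tensor:
    """One Shannon entropy value (in nats) per layer, taken from the
    intermediate next-token distribution at the prompt's final position."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():  # frozen model: no gradients or fine-tuning
        out = model(**inputs, output_hidden_states=True)
    entropies = []
    for h in out.hidden_states:           # embedding output + one state per layer
        h = model.transformer.ln_f(h)     # final layer norm (GPT-2 naming; assumption)
        logits = model.lm_head(h[0, -1])  # project hidden state into vocabulary space
        logp = torch.log_softmax(logits, dim=-1)
        entropies.append(-(logp.exp() * logp).sum())  # H = -sum_v p(v) log p(v)
    return torch.stack(entropies)

model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
tokenizer = AutoTokenizer.from_pretrained("gpt2")
print(entropy_profile(model, tokenizer, "The capital of France is"))
```

The returned vector is the entropy profile: one interpretable scalar per layer, obtained from a frozen model with no gradient computation or fine-tuning.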

Related tags

Transformer models · entropy analysis · computation patterns · Shannon entropy · model interpretability