cs.AI updates on arXiv.org, July 28, 12:43
Studying Cross-cluster Modularity in Neural Networks

 

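This paper proposes improving neural network interpretability through clusterability. It defines a clusterability measure and trains models with a "clusterability loss" function so that they form non-interacting clusters, which makes the models more modular. The study covers CNNs, small transformers, GPT-2, Pythia, and other models, and reports new properties of the resulting clustered models.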

arXiv:2502.02470v3 Announce Type: replace-cross Abstract: An approach to improve neural network interpretability is via clusterability, i.e., splitting a model into disjoint clusters that can be studied independently. We define a measure for clusterability and show that pre-trained models form highly enmeshed clusters via spectral graph clustering. We thus train models to be more modular using a "clusterability loss" function that encourages the formation of non-interacting clusters. We then investigate the emerging properties of these highly clustered models. We find our trained clustered models do not exhibit more task specialization, but do form smaller circuits. We investigate CNNs trained on MNIST and CIFAR, small transformers trained on modular addition, GPT-2 and Pythia on the Wiki dataset, and Gemma on a Chemistry dataset. This investigation shows what to expect from clustered models.
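The abstract does not spell out the clusterability measure, so the following is only an illustrative sketch under a stated assumption: clusterability is taken to be the fraction of squared weight mass that lies on connections whose input and output units share a cluster, and the "clusterability loss" is taken to be its negative. The function names `clusterability` and `clusterability_loss`, and the fixed cluster labels, are hypothetical and not taken from the paper.

```python
import numpy as np

def clusterability(W, row_clusters, col_clusters):
    """Fraction of squared weight mass on intra-cluster connections.

    ASSUMPTION: this definition is illustrative; the abstract does not
    give the exact formula used by the authors.
    """
    W = np.asarray(W, dtype=float)
    rows = np.asarray(row_clusters)
    cols = np.asarray(col_clusters)
    # same[i, j] is True when row unit i and column unit j share a cluster
    same = rows[:, None] == cols[None, :]
    sq = W ** 2
    total = sq.sum()
    if total == 0.0:
        return 0.0  # an all-zero layer has no weight mass to attribute
    return float((sq * same).sum() / total)

def clusterability_loss(W, row_clusters, col_clusters):
    """Hypothetical loss term: minimizing it pushes weight mass into clusters."""
    return -clusterability(W, row_clusters, col_clusters)

labels = [0, 0, 1, 1]

# Block-diagonal weights: the two clusters never interact.
W_block = np.array([[1.0, 2.0, 0.0, 0.0],
                    [3.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 2.0, 1.0],
                    [0.0, 0.0, 1.0, 2.0]])
score_block = clusterability(W_block, labels, labels)

# Uniform dense weights: half of the mass crosses cluster boundaries.
W_dense = np.ones((4, 4))
score_dense = clusterability(W_dense, labels, labels)

print(score_block, score_dense)
```

Under this assumed definition, a layer whose clusters do not interact scores 1, while a fully enmeshed layer with two equal-sized clusters scores 0.5. In training, such a term would be added to the task loss with a weighting coefficient, which is also an assumption here.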

