Concept Probing: Where to Find Human-Defined Concepts (Extended Version)

cs.AI updates on arXiv.org, July 28, 12:42

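This paper proposes a method, based on informativeness and regularity, that automatically identifies which layers of a neural network model are suitable for probing a given concept, and validates its effectiveness through empirical analysis.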

arXiv:2507.18681v1. Announce type: cross.

Abstract: Concept probing has recently gained popularity as a way for humans to peek into what is encoded within artificial neural networks. In concept probing, additional classifiers are trained to map the internal representations of a model into human-defined concepts of interest. However, the performance of these probes is highly dependent on the internal representations they probe from, making identifying the appropriate layer to probe an essential task. In this paper, we propose a method to automatically identify which layer's representations in a neural network model should be considered when probing for a given human-defined concept of interest, based on how informative and regular the representations are with respect to the concept. We validate our findings through an exhaustive empirical analysis over different neural network models and datasets.
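To make the idea of a probe concrete, here is a minimal sketch of layer-wise concept probing on synthetic data. It is not the method proposed in the paper: the abstract does not specify how informativeness and regularity are measured, so this sketch simply ranks layers by the held-out accuracy of a linear probe, a common naive baseline. The function names, the toy "layers", and the signal strengths are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_probe(X, y, ridge=1e-2):
    """Fit a ridge-regularized least-squares linear probe for labels in {0, 1}."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    t = 2.0 * y - 1.0                              # map labels to {-1, +1}
    A = Xb.T @ Xb + ridge * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ t)

def probe_accuracy(w, X, y):
    """Accuracy of the probe when thresholding its output at zero."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    pred = (Xb @ w > 0).astype(int)
    return float((pred == y).mean())

def make_toy_layers(n=600, dim=16, seed=0):
    """Hypothetical stand-in for activations recorded at three layers.

    Each layer is Gaussian noise plus the concept signal along one direction,
    with a layer-specific strength (0.0 means the concept is not encoded).
    """
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)
    signal = (2.0 * y - 1.0)[:, None]
    direction = rng.normal(size=(1, dim))
    direction /= np.linalg.norm(direction)
    strengths = {"layer_0": 0.0, "layer_1": 0.3, "layer_2": 1.5}
    layers = {
        name: rng.normal(size=(n, dim)) + s * signal * direction
        for name, s in strengths.items()
    }
    return layers, y

def best_layer_by_probe(layers, y, train_frac=0.7):
    """Train one probe per layer and return the layer with the best held-out accuracy."""
    n_train = int(train_frac * len(y))
    scores = {}
    for name, X in layers.items():
        w = fit_linear_probe(X[:n_train], y[:n_train])
        scores[name] = probe_accuracy(w, X[n_train:], y[n_train:])
    return max(scores, key=scores.get), scores

if __name__ == "__main__":
    layers, y = make_toy_layers()
    best, scores = best_layer_by_probe(layers, y)
    print("held-out accuracy per layer:", scores)
    print("selected layer:", best)
```

In a real setting, the toy arrays would be replaced by activations extracted from the model under study. Selecting a layer this way requires training a probe on every layer, which is the kind of exhaustive search an automatic layer-identification method would aim to avoid.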


Related tags

Neural networks, concept probing, automatic identification
