Local Specialization

Summary

Local specialization in deep neural networks refers to the extent to which specific parts of a network’s computational structure can be understood as performing distinct, comprehensible sub-tasks that contribute to the overall task. The paper proposes methods to quantify this specialization by examining clusters of neurons and evaluating their importance (how crucial they are to network performance) and coherence (how consistently they associate with input features). The researchers develop statistical techniques based on neuron interpretation methods and apply them to neuron partitions created through spectral clustering of network weights or activation correlations. Their findings suggest that graph-based partitioning can effectively reveal local specialization, even when based solely on network weights. This approach offers a way to automatically identify groups of neurons that can be understood in abstract terms, providing insights into the functional organization of deep neural networks.
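
As a rough illustration of the weight-based partitioning step described above, the sketch below treats every unit of a small fully connected network as a graph node, uses absolute weight values as edge strengths, and splits the units into two groups with a spectral bipartition. It is a minimal sketch under stated assumptions, not the paper's implementation: the toy weights `W1` and `W2`, the helper names `neuron_graph` and `spectral_bipartition`, the use of absolute weights as edge strengths, and the restriction to two clusters (sign of the second eigenvector, found by power iteration, rather than several eigenvectors followed by k-means) are all choices made here for brevity.

```python
import math

def neuron_graph(weight_mats):
    """Symmetric adjacency over all units, with |weight| as edge strength.

    weight_mats[k][j][i] is the weight from unit i of layer k to unit j of
    layer k + 1 (hypothetical layout chosen for this sketch).
    """
    sizes = [len(weight_mats[0][0])] + [len(w) for w in weight_mats]
    offsets = [sum(sizes[:k]) for k in range(len(sizes))]
    n = sum(sizes)
    adj = [[0.0] * n for _ in range(n)]
    for k, w in enumerate(weight_mats):
        for j, row in enumerate(w):
            for i, val in enumerate(row):
                a, b = offsets[k] + i, offsets[k + 1] + j
                adj[a][b] = adj[b][a] = abs(val)
    return adj

def spectral_bipartition(adj, iters=500):
    """Two-way spectral split using the sign of the second eigenvector.

    Works with N = D^{-1/2} A D^{-1/2}; its second-largest eigenvector is the
    second-smallest eigenvector of the normalized Laplacian I - N.
    Assumes every unit has at least one nonzero connection.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    inv_sqrt = [1.0 / math.sqrt(d) for d in deg]

    def apply_shifted(v):
        # Multiply by I + N so all eigenvalues are nonnegative and power
        # iteration converges to the algebraically largest one.
        return [
            v[i] + inv_sqrt[i] * sum(adj[i][j] * inv_sqrt[j] * v[j] for j in range(n))
            for i in range(n)
        ]

    # The leading eigenvector of N is proportional to sqrt(degree).
    norm = math.sqrt(sum(deg))
    top = [math.sqrt(d) / norm for d in deg]

    v = [float(i + 1) for i in range(n)]  # fixed start vector for determinism
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(v, top))
        v = [a - dot * b for a, b in zip(v, top)]  # remove leading component
        v = apply_shifted(v)
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    return [1 if x > 0 else 0 for x in v]

# Toy 4-4-2 network with two planted groups: strong weights inside each
# group, weak weights across groups (made-up numbers for illustration).
W1 = [
    [0.9, 1.1, 0.05, -0.02],
    [-1.0, 0.8, 0.03, 0.04],
    [0.02, -0.05, 1.2, -0.9],
    [0.04, 0.01, -0.8, 1.0],
]
W2 = [
    [1.0, -1.1, 0.03, 0.02],
    [-0.04, 0.05, 0.9, 1.2],
]

labels = spectral_bipartition(neuron_graph([W1, W2]))
print(labels)  # one cluster label per unit: 4 inputs, 4 hidden, 2 outputs
```

On this toy network the two clusters should line up with the planted groups (inputs 0 and 1, hidden units 0 and 1, and output 0 in one cluster; the remaining units in the other), though which group receives label 0 or 1 is arbitrary. Importance and coherence would then be evaluated per cluster, for example by comparing each cluster against random groups of units of the same size.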

Research Papers