ConceptTracer: Interactive Analysis of Concept Saliency and Selectivity in Neural Representations

Abstract

Neural networks deliver impressive predictive performance across a variety of tasks, but their decision-making is often opaque. Despite growing interest in mechanistic interpretability, tools for systematically exploring the representations learned by neural networks in general, and by tabular foundation models in particular, remain limited.

In this work, we introduce ConceptTracer, an interactive application for analyzing neural representations through the lens of human-interpretable concepts. ConceptTracer integrates two information-theoretic measures that quantify concept saliency and selectivity, enabling researchers and practitioners to identify neurons that respond strongly to individual concepts.

We demonstrate the utility of ConceptTracer on representations learned by TabPFN and show that our approach facilitates the discovery of interpretable neurons. Together, these capabilities provide a practical framework for investigating how neural networks like TabPFN encode concept-level information.

ConceptTracer is available at https://github.com/ml-lab-htw/concept-tracer.
